[PATCH v2 2/2] treewide: Add the __GFP_PACKED flag to several non-DMA kmalloc() allocations

Vlastimil Babka vbabka at suse.cz
Thu Nov 3 09:15:51 PDT 2022


On 10/26/22 11:48, Catalin Marinas wrote:
>> > diff --git a/lib/kobject.c b/lib/kobject.c
>> > index a0b2dbfcfa23..2c4acb36925d 100644
>> > --- a/lib/kobject.c
>> > +++ b/lib/kobject.c
>> > @@ -144,7 +144,7 @@ char *kobject_get_path(struct kobject *kobj, gfp_t gfp_mask)
>> >  	len = get_kobj_path_length(kobj);
>> >  	if (len == 0)
>> >  		return NULL;
>> > -	path = kzalloc(len, gfp_mask);
>> > +	path = kzalloc(len, gfp_mask | __GFP_PACKED);
>> 
>> This might not be small, and it's going to be very very short-lived
>> (within a single function call), why does it need to be allocated this
>> way?
> 
> Regarding short-lived objects, you are right, they won't affect
> slabinfo. My ftrace-fu is not great, I only looked at the allocation
> hits and they keep adding up without counting how many are
> freed. So maybe we need tracing free() as well but not always easy to
> match against the allocation point and infer how many live objects there
> are.

BTW, since 6.1-rc1 we have a new way with slub_debug to determine how much
memory is wasted, thanks to commit 6edf2576a6cc ("mm/slub: enable debugging
memory wasting of kmalloc") by Feng Tang.

You need to boot the kernel with a parameter such as:
slub_debug=U,kmalloc-64,kmalloc-128,kmalloc-192,kmalloc-256
(or just slub_debug=U,kmalloc-* for all sizes, but I guess you are
interested mainly in those that are affected by DMA alignment)
Note that it adds some alloc/free CPU overhead and memory overhead, so it is
not intended for normal production use.

Then you can check e.g.
cat /sys/kernel/debug/slab/kmalloc-128/alloc_traces | head -n 50
     77 set_kthread_struct+0x60/0x100 waste=1232/16 age=19492/31067/32465 pid=2 cpus=0-3
        __kmem_cache_alloc_node+0x102/0x340
        kmalloc_trace+0x26/0xa0
        set_kthread_struct+0x60/0x100
        copy_process+0x1903/0x2ee0
        kernel_clone+0xf4/0x4f0
        kernel_thread+0xae/0xe0
        kthreadd+0x491/0x500
        ret_from_fork+0x22/0x30

which tells you there are currently 77 live allocations with this exact
stack trace. The new information in 6.1 is the "waste=1232/16" which
means these allocations waste 16 bytes each due to rounding up to the
kmalloc cache size, or 1232 bytes in total (16*77). This should help
find the prominent sources of waste.
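
To go from eyeballing individual entries to a ranking, something like the
following could be used. It is only a rough sketch: rank_waste is a name I
made up for illustration, and it assumes the header lines look like the one
shown above, i.e. count in the first field, caller in the second, and a
"waste=<total>/<per-object>" field somewhere after that. Entries without a
waste= field and the stack frame lines are skipped.

```shell
#!/bin/sh
# Hypothetical helper: rank alloc_traces entries by total wasted bytes.
# Reads alloc_traces text on stdin and prints, largest total first:
#   <total_waste> <waste_per_object> <live_count> <caller>
rank_waste() {
    awk '
        # Only header lines carry a waste=<total>/<per-object> field.
        # Stack frame lines have a single field, so the loop never runs.
        {
            for (i = 3; i <= NF; i++) {
                if ($i ~ /^waste=[0-9]+\/[0-9]+$/) {
                    # Strip the "waste=" prefix (6 chars), split on "/".
                    split(substr($i, 7), w, "/")
                    print w[1], w[2], $1, $2
                    break
                }
            }
        }
    ' | sort -k1,1nr
}
```

For example (as root, since debugfs is normally not readable otherwise):
rank_waste < /sys/kernel/debug/slab/kmalloc-128/alloc_traces | head
For the entry above this would print
"1232 16 77 set_kthread_struct+0x60/0x100".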


