[PATCH v2 2/2] treewide: Add the __GFP_PACKED flag to several non-DMA kmalloc() allocations

Catalin Marinas catalin.marinas at arm.com
Wed Oct 26 02:48:58 PDT 2022


On Wed, Oct 26, 2022 at 08:50:07AM +0200, Greg Kroah-Hartman wrote:
> On Tue, Oct 25, 2022 at 09:52:47PM +0100, Catalin Marinas wrote:
> > diff --git a/drivers/usb/core/message.c b/drivers/usb/core/message.c
> > index 4d59d927ae3e..bff8901dc426 100644
> > --- a/drivers/usb/core/message.c
> > +++ b/drivers/usb/core/message.c
> > @@ -525,7 +525,8 @@ int usb_sg_init(struct usb_sg_request *io, struct usb_device *dev,
> >  	}
> >  
> >  	/* initialize all the urbs we'll use */
> > -	io->urbs = kmalloc_array(io->entries, sizeof(*io->urbs), mem_flags);
> > +	io->urbs = kmalloc_array(io->entries, sizeof(*io->urbs),
> > +				 mem_flags | __GFP_PACKED);
> 
> Hey look, you just did what I was worried about :(
> 
> Oh wait, no, this is just the urb structure, not the actual data pointer
> sent on the urb.

Yeah, I agree it gets tricky to review such patches. My hope is that
driver writers won't start adding such flags everywhere, only where
there's a significant number of allocations. Better flag naming would
make this more obvious.

> Ick, this is going to get really hard to review over time.  I feel for
> the need to want to start to pack things in smaller, but this is going
> to be really really hard for maintainers to review submissions.
> Especially if the only way problems show up is in random platforms where
> the alignment causes a failure.
> 
> Anyway we can make all arches fail if we submit pointers that are not
> aligned properly to the DMA engines?

As I mentioned in a prior reply, we could add a warning comparing the
slab size with ARCH_DMA_MINALIGN (or cache_line_size(), though the
latter could trigger on fully-coherent architectures).

> > diff --git a/fs/binfmt_elf.c b/fs/binfmt_elf.c
> > index 63c7ebb0da89..e5ad1a3244fb 100644
> > --- a/fs/binfmt_elf.c
> > +++ b/fs/binfmt_elf.c
> > @@ -883,7 +883,8 @@ static int load_elf_binary(struct linux_binprm *bprm)
> >  			goto out_free_ph;
> >  
> >  		retval = -ENOMEM;
> > -		elf_interpreter = kmalloc(elf_ppnt->p_filesz, GFP_KERNEL);
> > +		elf_interpreter = kmalloc(elf_ppnt->p_filesz,
> > +					  GFP_KERNEL | __GFP_PACKED);
> 
> If this is going to now be sprinkled all over the tree, why not make it
> a "real" flag, "GFP_PACKED"?  Or better yet, have it describe the
> allocation better, "GFP_TINY" or
> "GFP_I_KNOW_THIS_IS_TINY_WITH_NO_DMA_ISSUES"

Linus originally suggested GFP_NODMA. We could go with that (and a
corresponding KMALLOC_NODMA_MINALIGN).

> > --- a/lib/kasprintf.c
> > +++ b/lib/kasprintf.c
> > @@ -22,7 +22,7 @@ char *kvasprintf(gfp_t gfp, const char *fmt, va_list ap)
> >  	first = vsnprintf(NULL, 0, fmt, aq);
> >  	va_end(aq);
> >  
> > -	p = kmalloc_track_caller(first+1, gfp);
> > +	p = kmalloc_track_caller(first+1, gfp | __GFP_PACKED);
> 
> How do we know this is going to be small?

We don't need to know it's small. If it's over 96 bytes on arm64, it
goes in the kmalloc-128 cache or higher. It can even use the kmalloc-192
cache that's not aligned to ARCH_DMA_MINALIGN (128). That's why I'd
avoid GFP_TINY as this flag is not about size but rather alignment (e.g.
192 may not be DMA safe but it's larger than 128).

That said, I should try to identify sizes > 128 and <= 192 and pass such
flag.

> > diff --git a/lib/kobject.c b/lib/kobject.c
> > index a0b2dbfcfa23..2c4acb36925d 100644
> > --- a/lib/kobject.c
> > +++ b/lib/kobject.c
> > @@ -144,7 +144,7 @@ char *kobject_get_path(struct kobject *kobj, gfp_t gfp_mask)
> >  	len = get_kobj_path_length(kobj);
> >  	if (len == 0)
> >  		return NULL;
> > -	path = kzalloc(len, gfp_mask);
> > +	path = kzalloc(len, gfp_mask | __GFP_PACKED);
> 
> This might not be small, and it's going to be very very short-lived
> (within a single function call), why does it need to be allocated this
> way?

Regarding short-lived objects, you are right, they won't affect
slabinfo. My ftrace-fu is not great, I only looked at the allocation
hits and they keep adding up without counting how many are
freed. So maybe we need tracing free() as well but not always easy to
match against the allocation point and infer how many live objects there
are.

> > @@ -749,7 +749,7 @@ static struct kobject *kobject_create(void)
> >  {
> >  	struct kobject *kobj;
> >  
> > -	kobj = kzalloc(sizeof(*kobj), GFP_KERNEL);
> > +	kobj = kzalloc(sizeof(*kobj), GFP_KERNEL | __GFP_PACKED);
> 
> This might make sense, but what type of size are you wanting to see
> these packed allocations require?  This is not all that tiny, but I
> guess it falls under the 96 bytes?  Where did that number come from?

With ARCH_DMA_MINALIGN of 128, normally we'd only have kmalloc-192,
kmalloc-256 and higher caches. With this patch I'd like to enable
kmalloc-{8,16,32,64,96,192}. So 96 is the size below 128 that would go
into a smaller kmalloc cache (and I missed 192 in my ftrace histogram).

-- 
Catalin



More information about the linux-arm-kernel mailing list