[PATCH 2/2] iommu: Extend LPAE page table format to support custom allocators

Boris Brezillon boris.brezillon at collabora.com
Fri Nov 10 01:52:31 PST 2023


On Wed, 20 Sep 2023 17:42:01 +0100
Robin Murphy <robin.murphy at arm.com> wrote:

> On 09/08/2023 1:17 pm, Boris Brezillon wrote:
> > We need that in order to implement the VM_BIND ioctl in the GPU driver
> > targeting new Mali GPUs.
> > 
> > VM_BIND is about executing MMU map/unmap requests asynchronously,
> > possibly after waiting for external dependencies encoded as dma_fences.
> > We intend to use the drm_sched framework to automate the dependency
> > tracking and VM job dequeuing logic, but this comes with its own set
> > of constraints, one of them being that we are not allowed to
> > allocate memory in drm_gpu_scheduler_ops::run_job(), to avoid this
> > sort of deadlock:
> > 
> > - VM_BIND map job needs to allocate a page table to map some memory
> >    to the VM. No memory available, so kswapd is kicked
> > - GPU driver shrinker backend ends up waiting on the fence attached to
> >    the VM map job or any other job fence depending on this VM operation.
> > 
> > With custom allocators, we will be able to pre-reserve enough pages to
> > guarantee the map/unmap operations we queued will take place without
> > going through the system allocator. But we can also optimize
> > allocation/reservation by not freeing pages immediately, so any
> > upcoming page table allocation requests can be serviced by some free
> > page table pool kept at the driver level.
> 
> We should bear in mind it's also potentially valuable for other aspects 
> of GPU and similar use-cases, like fine-grained memory accounting and 
> resource limiting. That's a significant factor in this approach vs. 
> internal caching schemes that could only solve the specific reclaim concern.

I mentioned these other cases in v2. Let me know if that's not detailed
enough.
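
To make the reclaim case concrete, here is a rough userspace sketch of
the kind of driver-level pool the commit message describes. It is only
an illustration under my own assumptions, not code from the patch:
struct pt_pool, pt_pool_reserve(), pt_pool_alloc(), pt_pool_free() and
PT_PAGE_SIZE are all hypothetical names, and malloc-style calls stand
in for the system page allocator. The point is that reservation (which
may allocate) happens before the job is queued, while the allocation
done from the run_job() path only pops from the pool.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

#define PT_PAGE_SIZE 4096 /* hypothetical page-table page size */

/* Hypothetical driver-level pool of free page-table pages. */
struct pt_pool {
	void **pages;    /* stack of free pages */
	size_t count;    /* pages currently available */
	size_t capacity; /* number of slots in pages[] */
};

/*
 * May call into the system allocator, so this belongs in the submission
 * path, before the job is handed to the scheduler.
 * Returns 0 on success, -1 if the reservation could not be satisfied.
 */
int pt_pool_reserve(struct pt_pool *pool, size_t needed)
{
	if (needed > pool->capacity) {
		void **tmp = realloc(pool->pages, needed * sizeof(*tmp));

		if (!tmp)
			return -1;
		pool->pages = tmp;
		pool->capacity = needed;
	}

	while (pool->count < needed) {
		void *page = aligned_alloc(PT_PAGE_SIZE, PT_PAGE_SIZE);

		if (!page)
			return -1;
		pool->pages[pool->count++] = page;
	}

	return 0;
}

/*
 * Never touches the system allocator, so it is safe to call from the
 * run_job()-like path. Returns NULL if the caller under-reserved.
 */
void *pt_pool_alloc(struct pt_pool *pool)
{
	void *page;

	if (!pool->count)
		return NULL;

	page = pool->pages[--pool->count];
	memset(page, 0, PT_PAGE_SIZE); /* page tables start out empty */
	return page;
}

/* Keep the page around for later requests instead of freeing it at once. */
void pt_pool_free(struct pt_pool *pool, void *page)
{
	if (pool->count < pool->capacity)
		pool->pages[pool->count++] = page;
	else
		free(page);
}
```

A real driver would also need locking and a way to size the
reservation from the worst-case number of page table levels touched by
a map request; both are left out here.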

> > diff --git a/drivers/iommu/io-pgtable.c b/drivers/iommu/io-pgtable.c
> > index f4caf630638a..e273c18ae22b 100644
> > --- a/drivers/iommu/io-pgtable.c
> > +++ b/drivers/iommu/io-pgtable.c
> > @@ -47,6 +47,18 @@ static int check_custom_allocator(enum io_pgtable_fmt fmt,
> >   	if (!cfg->alloc)
> >   		return 0;
> >   
> > +	switch (fmt) {
> > +	case ARM_32_LPAE_S1:
> > +	case ARM_32_LPAE_S2:
> > +	case ARM_64_LPAE_S1:
> > +	case ARM_64_LPAE_S2:
> > +	case ARM_MALI_LPAE:
> > +		return 0;  
> 
> I remain not entirely convinced by the value of this, but could it at 
> least be done in a more scalable manner like some kind of flag provided 
> by the format itself?

I added a caps flag to io_pgtable_init_fns in v2. Feels a bit weird to
add a field that's not a function pointer in a struct that's prefixed
with _fns (which I guess stands for _functions), but oh well.
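
For reference, the idea boils down to something like the sketch below.
This is a simplified userspace model, not the actual v2 code: the
types are stand-ins for the kernel ones, and FMT_CAP_CUSTOM_ALLOCATOR,
the two example formats and the demo allocator are names made up for
illustration. I'm also assuming formats without the flag get rejected
with -EINVAL when a custom allocator is passed.

```c
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>

/* Stand-in formats; the real enum has many more entries. */
enum pgtable_fmt { FMT_LPAE_S1, FMT_LEGACY, FMT_COUNT };

/* Hypothetical capability bit advertised by a format. */
#define FMT_CAP_CUSTOM_ALLOCATOR (1u << 0)

struct pgtable_cfg {
	void *(*alloc)(void *cookie, size_t size);
	void (*free)(void *cookie, void *pages, size_t size);
};

/* The per-format descriptor carries a caps field next to its callbacks. */
struct pgtable_init_fns {
	unsigned int caps;
};

const struct pgtable_init_fns lpae_s1_init_fns = {
	.caps = FMT_CAP_CUSTOM_ALLOCATOR,
};

const struct pgtable_init_fns legacy_init_fns = {
	.caps = 0,
};

const struct pgtable_init_fns *const init_table[FMT_COUNT] = {
	[FMT_LPAE_S1] = &lpae_s1_init_fns,
	[FMT_LEGACY] = &legacy_init_fns,
};

/* Demo allocator callbacks, only here to exercise the check. */
void *demo_alloc(void *cookie, size_t size)
{
	(void)cookie;
	return calloc(1, size);
}

void demo_free(void *cookie, void *pages, size_t size)
{
	(void)cookie;
	(void)size;
	free(pages);
}

int check_custom_allocator(enum pgtable_fmt fmt, const struct pgtable_cfg *cfg)
{
	/* No custom allocator, nothing to check. */
	if (!cfg->alloc)
		return 0;

	/* The format itself says whether it can cope with one. */
	if (init_table[fmt]->caps & FMT_CAP_CUSTOM_ALLOCATOR)
		return 0;

	return -EINVAL;
}
```

With this, a new format opts in by setting the flag in its own
descriptor, and the generic check no longer needs a switch listing
every supported format.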

Regards,

Boris



More information about the linux-arm-kernel mailing list