[RFC 2/7] dma: Handle swiotlb throttling for SGLs

Petr Tesařík petr at tesarici.cz
Fri Aug 23 01:02:52 PDT 2024


On Thu, 22 Aug 2024 11:37:13 -0700
mhkelley58 at gmail.com wrote:

> From: Michael Kelley <mhklinux at outlook.com>
> 
> When a DMA map request is for a SGL, each SGL entry results in an
> independent mapping operation. If the mapping requires a bounce buffer
> due to running in a CoCo VM or due to swiotlb=force on the boot line,
> swiotlb is invoked. If swiotlb throttling is enabled for the request,
> each SGL entry results in a separate throttling operation. This is
> problematic because a thread may be holding swiotlb memory while waiting
> for memory to become free.
> 
> Resolve this problem by only allowing throttling on the 0th SGL
> entry. When unmapping the SGL, unmap entries 1 thru N-1 first, then
> unmap entry 0 so that the throttle isn't released until all swiotlb
> memory has been freed.
> 
> Signed-off-by: Michael Kelley <mhklinux at outlook.com>
> ---
> This approach to SGLs muddies the line between DMA direct and swiotlb
> throttling functionality. To keep the MAY_BLOCK attr fully generic, it
> should propagate to the mapping of all SGL entries.
> 
> An alternate approach is to define an additional DMA attribute that
> is internal to the DMA layer. Instead of clearing MAX_BLOCK, this
> attr is added by dma_direct_map_sg() when mapping SGL entries other
> than the 0th entry. swiotlb would do throttling only when MAY_BLOCK
> is set and this new attr is not set.
> 
> This approach has a modest amount of additional complexity. Given
> that we currently have no other users of the MAY_BLOCK attr, the
> conceptual cleanliness may not be warranted until we do.
> 
> Thoughts?

If we agree to change the unthrottling logic (see my comment to your
RFC 1/7), we'll need an additional attribute to delay unthrottling when
unmapping sg list entries 1 to N-1. This attribute could convey that
the mapping is the non-initial segment of an sg list and it could then
be also used to disable blocking in swiotlb_tbl_map_single().

> 
>  kernel/dma/direct.c | 35 ++++++++++++++++++++++++++++++-----
>  1 file changed, 30 insertions(+), 5 deletions(-)
> 
> diff --git a/kernel/dma/direct.c b/kernel/dma/direct.c
> index 4480a3cd92e0..80e03c0838d4 100644
> --- a/kernel/dma/direct.c
> +++ b/kernel/dma/direct.c
> @@ -438,6 +438,18 @@ void dma_direct_sync_sg_for_cpu(struct device *dev,
>  		arch_sync_dma_for_cpu_all();
>  }
>  
> +static void dma_direct_unmap_sgl_entry(struct device *dev,
> +		struct scatterlist *sgl, enum dma_data_direction dir,

Nitpick: This parameter should probably be called "sg", because it is
never used to do any operation on the whole list. Similarly, the
function could be called dma_direct_unmap_sg_entry(), because there is
no dma_direct_unmap_sgl() either...

> +		unsigned long attrs)
> +
> +{
> +	if (sg_dma_is_bus_address(sgl))
> +		sg_dma_unmark_bus_address(sgl);
> +	else
> +		dma_direct_unmap_page(dev, sgl->dma_address,
> +				      sg_dma_len(sgl), dir, attrs);
> +}
> +
>  /*
>   * Unmaps segments, except for ones marked as pci_p2pdma which do not
>   * require any further action as they contain a bus address.
> @@ -449,12 +461,20 @@ void dma_direct_unmap_sg(struct device *dev, struct scatterlist *sgl,
>  	int i;
>  
>  	for_each_sg(sgl,  sg, nents, i) {
> -		if (sg_dma_is_bus_address(sg))
> -			sg_dma_unmark_bus_address(sg);
> -		else
> -			dma_direct_unmap_page(dev, sg->dma_address,
> -					      sg_dma_len(sg), dir, attrs);
> +		/*
> +		 * Skip the 0th SGL entry in case this SGL consists of
> +		 * throttled swiotlb mappings. In such a case, any other
> +		 * entries should be unmapped first since unmapping the
> +		 * 0th entry will release the throttle semaphore.
> +		 */
> +		if (!i)
> +			continue;
> +		dma_direct_unmap_sgl_entry(dev, sg, dir, attrs);
>  	}
> +
> +	/* Now do the 0th SGL entry */
> +	if (nents)

I wonder if nents can ever be zero here, but it's nowhere enforced and
dma_map_sg_attrs() is exported, so I agree, let's play it safe.

> +		dma_direct_unmap_sgl_entry(dev, sgl, dir, attrs);
>  }
>  #endif
>  
> @@ -492,6 +512,11 @@ int dma_direct_map_sg(struct device *dev, struct scatterlist *sgl, int nents,
>  			ret = -EIO;
>  			goto out_unmap;
>  		}
> +
> +		/* Allow only the 0th SGL entry to block */
> +		if (!i)

Are you sure? I think the modified value of attrs is first used in the
next loop iteration, so the conditional should be removed, or else both
segment index 0 and 1 will keep the flag.

Petr T

> +			attrs &= ~DMA_ATTR_MAY_BLOCK;
> +
>  		sg_dma_len(sg) = sg->length;
>  	}
>  




More information about the Linux-nvme mailing list