[PATCH v1 1/2] dma: return 0 from dma_opt_mapping_size() when no real hint exists

Robin Murphy robin.murphy at arm.com
Tue Mar 17 02:43:46 PDT 2026


On 2026-03-16 8:39 pm, Ionut Nechita (Wind River) wrote:
> From: Ionut Nechita <ionut.nechita at windriver.com>
> 
> dma_opt_mapping_size() currently initializes its local size to SIZE_MAX
> and, when neither an IOMMU nor a DMA ops opt_mapping_size callback is
> present, returns min(dma_max_mapping_size(dev), SIZE_MAX).  That value
> is a large but finite number that has nothing to do with an optimal
> transfer size — it is simply the maximum the DMA layer can map.

No, the current code is correct. dma_opt_mapping_size() represents the 
largest size that can be mapped without incurring any significant 
performance penalty (compared to smaller sizes). If the implementation 
has no such restriction, then the largest "efficient" size is quite 
obviously just the largest size in total.
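
To make that reading concrete, here is a minimal userspace sketch of the
semantics described above. It is not the kernel code: the function name
and the convention that SIZE_MAX stands for "the backend has no efficiency
limit" are assumptions made for illustration, loosely modelled on the
removed lines of the quoted diff.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical model, not a kernel API.
 *
 * max_size:    what dma_max_mapping_size() would report for the device.
 * backend_opt: the backend's largest "efficient" size, or SIZE_MAX when
 *              the backend imposes no such limit (assumed convention).
 */
size_t model_opt_mapping_size(size_t max_size, size_t backend_opt)
{
	/* A mapping can never be larger than the maximum size. */
	assert(max_size != 0);

	/* With no backend limit, the largest efficient size is the maximum. */
	return backend_opt < max_size ? backend_opt : max_size;
}
```

With no backend limit the model returns the maximum unchanged, so a
caller sees a large value that is not guaranteed to be aligned to
anything in particular.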

> Callers such as scsi_transport_sas treat the return value as a genuine
> optimization hint and propagate it into Scsi_Host.opt_sectors, which in
> turn becomes the block device's optimal_io_size.  On SAS controllers
> like mpt3sas running with IOMMU in passthrough mode the bogus value
> (max_sectors << 9 = 16776704, rounded to 16773120) reaches mkfs.xfs,
> which computes swidth=4095 and sunit=2.  Because 4095 is not a multiple
> of 2, XFS rejects the geometry with "SB stripe unit sanity check
> failed", making it impossible to create filesystems during system
> bootstrap.

And that is obviously a bug. There has never been any guarantee offered 
about the values returned by either dma_max_mapping_size() or 
dma_opt_mapping_size() - they could be very large, very small, and 
certainly do not have to be powers of 2. Say an implementation has some 
internal data size optimisation that makes U32_MAX its largest 
"efficient" size, it's free to return that, and then you'll still have 
the same bug regardless of this bodge.
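
The arithmetic in the report can be checked with a short userspace
sketch. The helper names are hypothetical, and the 4096-byte filesystem
block and 8192-byte granule (sunit=2 blocks) are assumptions inferred
from the numbers quoted above. Rounding the hint down to a granule is
only one possible way for a consumer to sanitise it, not necessarily the
right fix or the right place for it.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: is swidth a whole multiple of sunit (same units)? */
int geometry_is_consistent(uint64_t sunit, uint64_t swidth)
{
	assert(sunit != 0);
	return swidth % sunit == 0;
}

/*
 * Hypothetical helper: round a size hint down to a multiple of the
 * granule the consumer actually needs. Returns 0 when the hint is
 * smaller than one granule.
 */
uint64_t round_hint_down(uint64_t hint_bytes, uint64_t granule_bytes)
{
	assert(granule_bytes != 0);
	return hint_bytes - hint_bytes % granule_bytes;
}
```

Under these assumptions, 16773120 bytes is 4095 blocks, which is not a
multiple of 2. Rounded down to an 8192-byte granule it becomes 16769024
bytes, or 4094 blocks, which is. The same rounding also handles a
hypothetical U32_MAX hint, which returning 0 for the "no hint" case
would not.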

Fix the actual bug instead of breaking common code in an attempt to paper
over it, an attempt that doesn't even achieve that very well.

Thanks,
Robin.

> Fix this by returning 0 when no backend provides an optimal mapping size
> hint.  A return value of 0 unambiguously means "no preference" and lets
> callers that use min() or min_not_zero() do the right thing without
> special-casing.
> 
> The only other in-tree caller (nvme-pci) is adjusted in the next patch.
> 
> Fixes: a229cc14f339 ("dma-mapping: add dma_opt_mapping_size()")
> Cc: stable at vger.kernel.org
> Signed-off-by: Ionut Nechita <ionut.nechita at windriver.com>
> ---
>   kernel/dma/mapping.c | 13 ++++++++-----
>   1 file changed, 8 insertions(+), 5 deletions(-)
> 
> diff --git a/kernel/dma/mapping.c b/kernel/dma/mapping.c
> index 78d8b4039c3e6..fffa6a3f191a3 100644
> --- a/kernel/dma/mapping.c
> +++ b/kernel/dma/mapping.c
> @@ -984,14 +984,17 @@ EXPORT_SYMBOL_GPL(dma_max_mapping_size);
>   size_t dma_opt_mapping_size(struct device *dev)
>   {
>   	const struct dma_map_ops *ops = get_dma_ops(dev);
> -	size_t size = SIZE_MAX;
>   
>   	if (use_dma_iommu(dev))
> -		size = iommu_dma_opt_mapping_size();
> -	else if (ops && ops->opt_mapping_size)
> -		size = ops->opt_mapping_size();
> +		return iommu_dma_opt_mapping_size();
> +	if (ops && ops->opt_mapping_size)
> +		return ops->opt_mapping_size();
>   
> -	return min(dma_max_mapping_size(dev), size);
> +	/*
> +	 * No backend provided an optimal size hint. Return 0 so that
> +	 * callers can distinguish "no hint" from a real value.
> +	 */
> +	return 0;
>   }
>   EXPORT_SYMBOL_GPL(dma_opt_mapping_size);
>   



