[PATCH v3 1/2] dma-direct: provide the ability to reserve per-numa CMA

Song Bao Hua (Barry Song) song.bao.hua at hisilicon.com
Wed Jul 22 17:26:03 EDT 2020



> -----Original Message-----
> From: Christoph Hellwig [mailto:hch at lst.de]
> Sent: Thursday, July 23, 2020 2:17 AM
> To: Song Bao Hua (Barry Song) <song.bao.hua at hisilicon.com>
> Cc: hch at lst.de; m.szyprowski at samsung.com; robin.murphy at arm.com;
> will at kernel.org; ganapatrao.kulkarni at cavium.com;
> catalin.marinas at arm.com; iommu at lists.linux-foundation.org; Linuxarm
> <linuxarm at huawei.com>; linux-arm-kernel at lists.infradead.org;
> linux-kernel at vger.kernel.org; Jonathan Cameron
> <jonathan.cameron at huawei.com>; Nicolas Saenz Julienne
> <nsaenzjulienne at suse.de>; Steve Capper <steve.capper at arm.com>; Andrew
> Morton <akpm at linux-foundation.org>; Mike Rapoport <rppt at linux.ibm.com>
> Subject: Re: [PATCH v3 1/2] dma-direct: provide the ability to reserve
> per-numa CMA
> 

+cc Prime and Daode who are interested in this patchset.

> On Sun, Jun 28, 2020 at 11:12:50PM +1200, Barry Song wrote:
> > This is useful for at least two scenarios:
> > 1. ARM64 smmu will get memory from local numa node, it can save its
> > command queues and page tables locally. Tests show it can decrease
> > dma_unmap latency a lot. For example, without this patch, smmu on
> > node2 will get memory from node0 by calling dma_alloc_coherent(),
> > typically, it has to wait for more than 560ns for the completion of
> > CMD_SYNC in an empty command queue; with this patch, it needs 240ns
> > only.
> > 2. when we set iommu passthrough, drivers will get memory from CMA,
> > local memory means much less latency.
> 
> I really don't like the config options.  With the boot parameters
> you can always hardcode that in CONFIG_CMDLINE anyway.

I understand your concern. The primary purpose of this patchset is to provide
a general way for users such as the IOMMU to get local coherent DMA buffers in
which to place their command queues and page tables. The first use case is what
really made me start preparing this patchset.

The second case is probably a positive side effect of this patchset for users
who care more about performance than DMA security; they may bypass the
IOMMU with:
	iommu.passthrough=
			[ARM64, X86] Configure DMA to bypass the IOMMU by default.
			Format: { "0" | "1" }
			0 - Use IOMMU translation for DMA.
			1 - Bypass the IOMMU for DMA.
			unset - Use value of CONFIG_IOMMU_DEFAULT_PASSTHROUGH.
In this case, they get local memory and therefore better performance.
However, that is not the primary purpose of this patchset.

Thanks
Barry



