[PATCH v7 0/3] make dma_alloc_coherent NUMA-aware by per-NUMA CMA

Song Bao Hua (Barry Song) song.bao.hua at hisilicon.com
Fri Aug 21 16:47:42 EDT 2020



> -----Original Message-----
> From: Song Bao Hua (Barry Song)
> Sent: Saturday, August 22, 2020 7:27 AM
> To: 'Mike Kravetz' <mike.kravetz at oracle.com>; hch at lst.de;
> m.szyprowski at samsung.com; robin.murphy at arm.com; will at kernel.org;
> ganapatrao.kulkarni at cavium.com; catalin.marinas at arm.com;
> akpm at linux-foundation.org
> Cc: iommu at lists.linux-foundation.org; linux-arm-kernel at lists.infradead.org;
> linux-kernel at vger.kernel.org; Zengtao (B) <prime.zeng at hisilicon.com>;
> huangdaode <huangdaode at huawei.com>; Linuxarm <linuxarm at huawei.com>
> Subject: RE: [PATCH v7 0/3] make dma_alloc_coherent NUMA-aware by
> per-NUMA CMA
> 
> 
> 
> > -----Original Message-----
> > From: Mike Kravetz [mailto:mike.kravetz at oracle.com]
> > Sent: Saturday, August 22, 2020 5:53 AM
> > To: Song Bao Hua (Barry Song) <song.bao.hua at hisilicon.com>; hch at lst.de;
> > m.szyprowski at samsung.com; robin.murphy at arm.com; will at kernel.org;
> > ganapatrao.kulkarni at cavium.com; catalin.marinas at arm.com;
> > akpm at linux-foundation.org
> > Cc: iommu at lists.linux-foundation.org; linux-arm-kernel at lists.infradead.org;
> > linux-kernel at vger.kernel.org; Zengtao (B) <prime.zeng at hisilicon.com>;
> > huangdaode <huangdaode at huawei.com>; Linuxarm <linuxarm at huawei.com>
> > Subject: Re: [PATCH v7 0/3] make dma_alloc_coherent NUMA-aware by
> > per-NUMA CMA
> >
> > Hi Barry,
> > Sorry for jumping in so late.
> >
> > On 8/21/20 4:33 AM, Barry Song wrote:
> > >
> > > with per-numa CMA, smmu will get memory from local numa node to save command
> > > queues and page tables. that means dma_unmap latency will be shrunk much.
> >
> > Since per-node CMA areas for hugetlb was introduced, I have been thinking
> > about the limited number of CMA areas.  In most configurations, I believe
> > it is limited to 7.  And, IIRC it is not something that can be changed at
> > runtime, you need to reconfig and rebuild to increase the number.  In contrast
> > some configs have NODES_SHIFT set to 10.  I wasn't too worried because of
> > the limited hugetlb use case.  However, this series is adding another user
> > of per-node CMA areas.
> >
> > With more users, should try to sync up number of CMA areas and number of
> > nodes?  Or, perhaps I am worrying about nothing?
> 
> Hi Mike,
> The current limitation is 8. If the server has 4 nodes and we enable both pernuma
> CMA and hugetlb, the last node will fail to get one cma area as the default
> global cma area will take 1 of 8. So users need to change menuconfig.
> If the server has 8 nodes, we enable one of pernuma cma and hugetlb, one node
> will fail to get cma.
>
> We may set the default number of CMA areas as 8+MAX_NODES(if hugetlb enabled) +
> MAX_NODES(if pernuma cma enabled) if we don't expect users to change config, but
> right now hugetlb has not an option in Kconfig to enable or disable like pernuma cma
> has DMA_PERNUMA_CMA.

I would prefer that we make a change like this:

config CMA_AREAS
	int "Maximum count of the CMA areas"
	depends on CMA
+	default 19 if NUMA
	default 7
	help
	  CMA allows to create CMA areas for particular purpose, mainly,
	  used as device private area. This parameter sets the maximum
	  number of CMA area in the system.

-	  If unsure, leave the default value "7".
+	  If unsure, leave the default value "7" or "19" if NUMA is used.

With this default, 1 + CONFIG_CMA_AREAS = 20 areas should be quite enough for almost all servers on the market.

With 2 NUMA nodes and both hugetlb CMA and per-NUMA CMA enabled, we need 2*2 + 1 = 5 areas.
With 4 NUMA nodes and both hugetlb CMA and per-NUMA CMA enabled, we need 2*4 + 1 = 9 areas.    -> default ARM64 config.
With 8 NUMA nodes and both hugetlb CMA and per-NUMA CMA enabled, we need 2*8 + 1 = 17 areas.

The default value covers the most common cases. It is not meant to cover servers with
NODES_SHIFT=10; those can set their own config, just as users already need to increase
CMA_AREAS if they add many CMA areas in the device tree, even on a system without NUMA.

What do you think, Mike?

Thanks
Barry

