[PATCHv2] arm64: Add atomic pool for non-coherent and CMA allocations.
Catalin Marinas
catalin.marinas at arm.com
Thu Jun 5 10:05:00 PDT 2014
Hi Laura,
On Mon, Jun 02, 2014 at 09:03:52PM +0100, Laura Abbott wrote:
> Neither CMA nor noncoherent allocations support atomic allocations.
> Add a dedicated atomic pool to support this.
CMA indeed doesn't support atomic allocations, but swiotlb does; the
only problem is the vmap() needed to create a non-cacheable mapping.
Could we not use the atomic pool only for non-coherent allocations?
> --- a/arch/arm64/mm/dma-mapping.c
> +++ b/arch/arm64/mm/dma-mapping.c
[...]
> static void *__dma_alloc_coherent(struct device *dev, size_t size,
> dma_addr_t *dma_handle, gfp_t flags,
> struct dma_attrs *attrs)
> @@ -53,7 +157,16 @@ static void *__dma_alloc_coherent(struct device *dev, size_t size,
> if (IS_ENABLED(CONFIG_ZONE_DMA) &&
> dev->coherent_dma_mask <= DMA_BIT_MASK(32))
> flags |= GFP_DMA;
> - if (IS_ENABLED(CONFIG_DMA_CMA)) {
So here just check for:
if ((flags & __GFP_WAIT) && IS_ENABLED(CONFIG_DMA_CMA)) {
> +
> + if (!(flags & __GFP_WAIT)) {
> + struct page *page = NULL;
> + void *addr = __alloc_from_pool(size, &page, true);
> +
> + if (addr)
> + *dma_handle = phys_to_dma(dev, page_to_phys(page));
> +
> + return addr;
and ignore the __alloc_from_pool() call.
> @@ -78,7 +191,9 @@ static void __dma_free_coherent(struct device *dev, size_t size,
> return;
> }
>
> - if (IS_ENABLED(CONFIG_DMA_CMA)) {
> + if (__free_from_pool(vaddr, size, true)) {
> + return;
> + } else if (IS_ENABLED(CONFIG_DMA_CMA)) {
> phys_addr_t paddr = dma_to_phys(dev, dma_handle);
>
> dma_release_from_contiguous(dev,
Here you should check the return value of dma_release_from_contiguous()
and, if it is false, fall back to the swiotlb release.
I guess we don't even need the IS_ENABLED(CONFIG_DMA_CMA) check, since
when CONFIG_DMA_CMA is disabled those functions return NULL/false anyway.
> @@ -100,9 +215,21 @@ static void *__dma_alloc_noncoherent(struct device *dev, size_t size,
> size = PAGE_ALIGN(size);
> order = get_order(size);
>
> + if (!(flags & __GFP_WAIT)) {
> + struct page *page = NULL;
> + void *addr = __alloc_from_pool(size, &page, false);
> +
> + if (addr)
> + *dma_handle = phys_to_dma(dev, page_to_phys(page));
> +
> + return addr;
> +
> + }
Here we need the atomic pool as we can't remap the memory as uncacheable
in atomic context.
> @@ -332,6 +461,65 @@ static struct notifier_block amba_bus_nb = {
>
> extern int swiotlb_late_init_with_default_size(size_t default_size);
>
> +static int __init atomic_pool_init(void)
> +{
> + struct dma_pool *pool = &atomic_pool;
> + pgprot_t prot = pgprot_writecombine(pgprot_default);
In linux-next I got rid of pgprot_default entirely, just use
__pgprot(PROT_NORMAL_NC).
> + unsigned long nr_pages = pool->size >> PAGE_SHIFT;
> + unsigned long *bitmap;
> + struct page *page;
> + struct page **pages;
> + int bitmap_size = BITS_TO_LONGS(nr_pages) * sizeof(long);
> +
> + bitmap = kzalloc(bitmap_size, GFP_KERNEL);
> + if (!bitmap)
> + goto no_bitmap;
> +
> + pages = kzalloc(nr_pages * sizeof(struct page *), GFP_KERNEL);
> + if (!pages)
> + goto no_pages;
> +
> + if (IS_ENABLED(CONFIG_CMA))
> + page = dma_alloc_from_contiguous(NULL, nr_pages,
> + get_order(pool->size));
> + else
> + page = alloc_pages(GFP_KERNEL, get_order(pool->size));
I think the safest option is to use GFP_DMA as well. Without knowing
exactly what devices will do, or what their dma masks are, I think
that's a safer bet. I plan to limit the CMA buffer to ZONE_DMA too, for
lack of a better option.
BTW, most of this code could be turned into a library, especially if we
don't need to separate the coherent and non-coherent pools. Also, a lot
of it is similar to the dma_alloc_from_coherent() implementation (apart
from the ioremap() call in dma_declare_coherent_memory() and the
per-device rather than global pool).
--
Catalin