[PATCH] dma-contiguous: support per-numa CMA for all architectures

Barry Song 21cnbao at gmail.com
Sun Jun 25 22:32:12 PDT 2023


On Sun, Jun 25, 2023 at 7:30 PM Yajun Deng <yajun.deng at linux.dev> wrote:
>
> June 24, 2023 8:40 AM, "Andrew Morton" <akpm at linux-foundation.org> wrote:
>
> > On Mon, 15 May 2023 13:38:21 +0200 Petr Tesařík <petr at tesarici.cz> wrote:
> >
> >> On Mon, 15 May 2023 11:23:27 +0000
> >> "Yajun Deng" <yajun.deng at linux.dev> wrote:
> >>
> >> May 15, 2023 5:49 PM, "Christoph Hellwig" <hch at lst.de> wrote:
> >
> > This looks fine to me. Can you please work with Barry to make sure
> > the slight different place of the initcall doesn't break anything
> > for his setup? I doubt it would, but I'd rather have a Tested-by:
> > tag.
> >> Barry's email is no longer in use. I can't reach him.
> >>
> >> Which one? I would hope that his Gmail account is still valid:
> >>
> >> Barry Song <21cnbao at gmail.com>
> >
> > Maybe his kernel.org address works...
> >
> > I have this patch stuck in limbo for 6.4. I guess I'll carry it over
> > into the next -rc cycle, see what happens.
> >
> > fwiw, it has been in -next for six weeks, no known issues.
>
> Hi Barry, does the slightly different placement of the initcall break anything?

I don't see a fundamental difference, since the initcall still runs after
arch_numa_init(), which is what we really depend on.

I also ran a test on QEMU with this command line:
qemu-system-aarch64 -M virt,gic-version=3 -nographic \
 -smp cpus=8 \
 -numa node,cpus=0-1,nodeid=0 \
 -numa node,cpus=2-3,nodeid=1 \
 -numa node,cpus=4-5,nodeid=2 \
 -numa node,cpus=6-7,nodeid=3 \
 -numa dist,src=0,dst=1,val=12 \
 -numa dist,src=0,dst=2,val=20 \
 -numa dist,src=0,dst=3,val=22 \
 -numa dist,src=1,dst=2,val=22 \
 -numa dist,src=2,dst=3,val=12 \
 -numa dist,src=1,dst=3,val=24 \
 -m 4096M -cpu cortex-a57 -kernel arch/arm64/boot/Image \
 -nographic -append "cma_pernuma=32M root=/dev/vda2  rw ip=dhcp sched_debug irqchip.gicv3_pseudo_nmi=1" \
 -drive if=none,file=extra/ubuntu16.04-arm64.img,id=hd0 -device virtio-blk-device,drive=hd0 \
 -net nic -net user,hostfwd=tcp::2222-:22
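
For reference, the six `-numa dist` options describe a full 4x4 distance matrix. The following is a minimal Python sketch, not part of the original test: the helper name `numa_distance_matrix` is made up for illustration, and it assumes QEMU's documented behaviour that a node's distance to itself is 10 and that a distance given in only one direction also applies in the reverse direction.

```python
import re


def numa_distance_matrix(args, nodes, local=10):
    """Build the node distance matrix implied by '-numa dist' options.

    Assumption: local distance is 10 and a one-directional entry is
    mirrored unless the reverse direction is given explicitly.
    """
    dist = [[local if i == j else None for j in range(nodes)]
            for i in range(nodes)]
    for src, dst, val in re.findall(r"dist,src=(\d+),dst=(\d+),val=(\d+)", args):
        s, d, v = int(src), int(dst), int(val)
        dist[s][d] = v
        if dist[d][s] is None:  # mirror only when the reverse is still unset
            dist[d][s] = v
    return dist


# The distance options from the command line above.
ARGS = (
    "-numa dist,src=0,dst=1,val=12 -numa dist,src=0,dst=2,val=20 "
    "-numa dist,src=0,dst=3,val=22 -numa dist,src=1,dst=2,val=22 "
    "-numa dist,src=2,dst=3,val=12 -numa dist,src=1,dst=3,val=24"
)

matrix = numa_distance_matrix(ARGS, nodes=4)
for row in matrix:
    print(row)
```

Under those assumptions every pair of nodes has a distance, so the guest sees four distinct nodes arranged as two closer pairs (0/1 and 2/3).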

In the guest, I can see that all CMA areas are reserved correctly:
~# dmesg | grep cma
[    0.000000] cma: cma_declare_contiguous_nid(size 0x0000000002000000, base 0x0000000000000000, limit 0x0000000000000000 alignment 0x0000000000000000)
[    0.000000] cma: Reserved 32 MiB at 0x000000007ce00000
[    0.000000] cma: dma_pernuma_cma_reserve: reserved 32 MiB on node 0
[    0.000000] cma: cma_declare_contiguous_nid(size 0x0000000002000000, base 0x0000000000000000, limit 0x0000000000000000 alignment 0x0000000000000000)
[    0.000000] cma: Reserved 32 MiB at 0x00000000bce00000
[    0.000000] cma: dma_pernuma_cma_reserve: reserved 32 MiB on node 1
[    0.000000] cma: cma_declare_contiguous_nid(size 0x0000000002000000, base 0x0000000000000000, limit 0x0000000000000000 alignment 0x0000000000000000)
[    0.000000] cma: Reserved 32 MiB at 0x00000000fce00000
[    0.000000] cma: dma_pernuma_cma_reserve: reserved 32 MiB on node 2
[    0.000000] cma: cma_declare_contiguous_nid(size 0x0000000002000000, base 0x0000000000000000, limit 0x0000000000000000 alignment 0x0000000000000000)
[    0.000000] cma: Reserved 32 MiB at 0x0000000100000000
[    0.000000] cma: dma_pernuma_cma_reserve: reserved 32 MiB on node 3
[    0.000000] cma: dma_contiguous_reserve(limit 100000000)
[    0.000000] cma: dma_contiguous_reserve: reserving 32 MiB for global area
[    0.000000] cma: cma_declare_contiguous_nid(size 0x0000000002000000, base 0x0000000000000000, limit 0x0000000100000000 alignment 0x0000000000000000)
[    0.000000] cma: Reserved 32 MiB at 0x00000000fae00000
[    0.000000] Kernel command line: cma_pernuma=32M root=/dev/vda2  rw ip=dhcp sched_debug irqchip.gicv3_pseudo_nmi=1
[    0.000000] Memory: 3848784K/4194304K available (16128K kernel code, 4152K rwdata, 10244K rodata, 8512K init, 612K bss, 181680K reserved, 163840K cma-reserved)
[    0.175309] cma: cma_alloc(cma (____ptrval____), count 128, align 7)
[    0.179264] cma: cma_alloc(): returned (____ptrval____)
[    0.179869] cma: cma_alloc(cma (____ptrval____), count 128, align 7)
[    0.180027] cma: cma_alloc(): returned (____ptrval____)
[    0.180187] cma: cma_alloc(cma (____ptrval____), count 128, align 7)
[    0.180374] cma: cma_alloc(): returned (____ptrval____)

So my feeling is that this patch is fine. Still, I would prefer that Yicong
and Tiantao, who have a real NUMA machine, test it with real device drivers
calling the DMA APIs to allocate memory from per-NUMA CMA on arm64,
even though I am 99.9% sure it is OK.

With their testing done, please feel free to add
Acked-by: Barry Song <baohua at kernel.org>

Thanks
Barry


