regression: insmod module failed in VM with nvdimm on

Ard Biesheuvel ardb at kernel.org
Thu Dec 1 04:53:08 PST 2022


On Thu, 1 Dec 2022 at 13:07, chenxiang (M) <chenxiang66 at hisilicon.com> wrote:
>
>
>
> On 2022/12/1 19:07, Ard Biesheuvel wrote:
> > On Thu, 1 Dec 2022 at 09:07, Ard Biesheuvel <ardb at kernel.org> wrote:
> >> On Thu, 1 Dec 2022 at 08:15, chenxiang (M) <chenxiang66 at hisilicon.com> wrote:
> >>> Hi Ard,
> >>>
> >>>
> >>> On 2022/11/30 16:18, Ard Biesheuvel wrote:
> >>>> On Wed, 30 Nov 2022 at 08:53, Marc Zyngier <maz at kernel.org> wrote:
> >>>>> On Wed, 30 Nov 2022 02:52:35 +0000,
> >>>>> "chenxiang (M)" <chenxiang66 at hisilicon.com> wrote:
> >>>>>> Hi,
> >>>>>>
> >>>>>> We boot the VM using the following command (with nvdimm on; QEMU
> >>>>>> version 6.1.50, kernel 6.0-rc4):
> >>>>> How relevant is the presence of the nvdimm? Do you observe the failure
> >>>>> without this?
> >>>>>
> >>>>>> qemu-system-aarch64 -machine
> >>>>>> virt,kernel_irqchip=on,gic-version=3,nvdimm=on  -kernel
> >>>>>> /home/kernel/Image -initrd /home/mini-rootfs/rootfs.cpio.gz -bios
> >>>>>> /root/QEMU_EFI.FD -cpu host -enable-kvm -net none -nographic -m
> >>>>>> 2G,maxmem=64G,slots=3 -smp 4 -append 'rdinit=init console=ttyAMA0
> >>>>>> ealycon=pl0ll,0x90000000 pcie_ports=native pciehp.pciehp_debug=1'
> >>>>>> -object memory-backend-ram,id=ram1,size=10G -device
> >>>>>> nvdimm,id=dimm1,memdev=ram1  -device ioh3420,id=root_port1,chassis=1
> >>>>>> -device vfio-pci,host=7d:01.0,id=net0,bus=root_port1
> >>>>>>
> >>>>>> Then, in the VM, we insmod a module and a vmalloc error occurs as
> >>>>>> follows (kernel 5.19-rc4 works normally, and the issue is still
> >>>>>> present on kernel 6.1-rc4):
> >>>>>>
> >>>>>> estuary:/$ insmod /lib/modules/$(uname -r)/hnae3.ko
> >>>>>> [    8.186563] vmap allocation for size 20480 failed: use
> >>>>>> vmalloc=<size> to increase size
> >>>>> Have you tried increasing the vmalloc size to check that this is
> >>>>> indeed the problem?
> >>>>>
> >>>>> [...]
> >>>>>
> >>>>>> We bisected the code with git, and it points to the patch c5a89f75d2a
> >>>>>> ("arm64: kaslr: defer initialization to initcall where permitted").
> >>>>> I guess you mean commit fc5a89f75d2a instead, right?
> >>>>>
> >>>>>> Do you have any idea about the issue?
> >>>>> I sort of suspect that the nvdimm gets vmap-ed and consumes a large
> >>>>> portion of the vmalloc space, but you give very little information
> >>>>> that could help here...
> >>>>>
> >>>> Ouch. I suspect what's going on here: that patch defers the
> >>>> randomization of the module region, so that we can decouple it from
> >>>> the very early init code.
> >>>>
> >>>> Obviously, it is happening too late now, and the randomized module
> >>>> region is overlapping with a vmalloc region that is in use by the time
> >>>> the randomization occurs.
> >>>>
> >>>> Does the below fix the issue?
> >>> The issue still occurs, but the change seems to reduce its probability:
> >>> before, it occurred almost every time; after the change, I tried 2-3
> >>> times and it occurred.
> >>> But when I change "subsys_initcall" back to "core_initcall", I tested
> >>> more than 20 times and it was still OK.
> >>>
> >> Thank you for confirming. I will send out a patch today.
> >>
> > ...but before I do that, could you please check whether the change
> > below fixes your issue as well?
>
> Yes, but I can only reply to you tomorrow, as someone else is testing on
> the only environment today.
>

That is fine, thanks.
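
For readers of the archive, the failure mode suspected earlier in this thread (a module window that is chosen late and lands on vmalloc space already in use) can be illustrated with a toy model. This is only a sketch and not kernel code: the `Range` type, the `alloc_in_window` helper, and every address below are made up for illustration and do not reflect the real arm64 memory layout. Only the 20480-byte request size is taken from the log quoted above.

```python
from dataclasses import dataclass

MB = 1 << 20


@dataclass(frozen=True)
class Range:
    start: int
    end: int  # exclusive


def alloc_in_window(window, used, size):
    """First-fit search for a free gap of `size` bytes inside `window`.

    `used` holds ranges that are already allocated. Returns the start
    address of the gap, or None if the window has no room.
    """
    cursor = window.start
    for u in sorted(used, key=lambda r: r.start):
        if u.end <= cursor:
            continue  # entirely below the search position
        if u.start >= window.end:
            break  # entirely above the window
        if u.start - cursor >= size:
            return cursor  # gap before this used range is big enough
        cursor = max(cursor, u.end)  # skip past the used range
    if window.end - cursor >= size:
        return cursor
    return None


# Hypothetical large mapping that is already in place when the window is picked.
used = [Range(512 * MB, 1024 * MB)]

# Hypothetical window that landed inside the in-use mapping.
overlapping_window = Range(600 * MB, 728 * MB)
# Hypothetical window that landed in untouched address space.
clear_window = Range(128 * MB, 256 * MB)

print(alloc_in_window(overlapping_window, used, 20480))
print(alloc_in_window(clear_window, used, 20480))
```

In this model, the request fails whenever the window is fully covered by a range that was allocated before the window was chosen, which matches the shape of the "vmap allocation for size 20480 failed" message, although the real allocator is considerably more involved.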



More information about the linux-arm-kernel mailing list