regression: insmod module failed in VM with nvdimm on

chenxiang (M) chenxiang66 at hisilicon.com
Wed Nov 30 23:15:04 PST 2022


Hi Ard,


On 2022/11/30 16:18, Ard Biesheuvel wrote:
> On Wed, 30 Nov 2022 at 08:53, Marc Zyngier <maz at kernel.org> wrote:
>> On Wed, 30 Nov 2022 02:52:35 +0000,
>> "chenxiang (M)" <chenxiang66 at hisilicon.com> wrote:
>>> Hi,
>>>
>>> We boot the VM using following commands (with nvdimm on)  (qemu
>>> version 6.1.50, kernel 6.0-r4):
>> How relevant is the presence of the nvdimm? Do you observe the failure
>> without this?
>>
>>> qemu-system-aarch64 -machine
>>> virt,kernel_irqchip=on,gic-version=3,nvdimm=on  -kernel
>>> /home/kernel/Image -initrd /home/mini-rootfs/rootfs.cpio.gz -bios
>>> /root/QEMU_EFI.FD -cpu host -enable-kvm -net none -nographic -m
>>> 2G,maxmem=64G,slots=3 -smp 4 -append 'rdinit=init console=ttyAMA0
>>> ealycon=pl0ll,0x90000000 pcie_ports=native pciehp.pciehp_debug=1'
>>> -object memory-backend-ram,id=ram1,size=10G -device
>>> nvdimm,id=dimm1,memdev=ram1  -device ioh3420,id=root_port1,chassis=1
>>> -device vfio-pci,host=7d:01.0,id=net0,bus=root_port1
>>>
>>> Then in VM we insmod a module, vmalloc error occurs as follows (kernel
>>> 5.19-rc4 is normal, and the issue is still on kernel 6.1-rc4):
>>>
>>> estuary:/$ insmod /lib/modules/$(uname -r)/hnae3.ko
>>> [    8.186563] vmap allocation for size 20480 failed: use
>>> vmalloc=<size> to increase size
>> Have you tried increasing the vmalloc size to check that this is
>> indeed the problem?
>>
>> [...]
>>
>>> We git bisect the code, and find the patch c5a89f75d2a ("arm64: kaslr:
>>> defer initialization to initcall where permitted").
>> I guess you mean commit fc5a89f75d2a instead, right?
>>
>>> Do you have any idea about the issue?
>> I sort of suspect that the nvdimm gets vmap-ed and consumes a large
>> portion of the vmalloc space, but you give very little information
>> that could help here...
>>
> Ouch. I suspect what's going on here: that patch defers the
> randomization of the module region, so that we can decouple it from
> the very early init code.
>
> Obviously, it is happening too late now, and the randomized module
> region is overlapping with a vmalloc region that is in use by the time
> the randomization occurs.
>
> Does the below fix the issue?

The issue still occurs, but the change seems to reduce the probability.
Before, it occurred almost every time; with the change, it still showed up
after 2-3 attempts.
However, when I change "subsys_initcall" to "core_initcall" instead, I have
tested more than 20 times and have not seen the failure.
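
For reference, below is a small userspace sketch of the initcall ordering as I
understand it from include/linux/init.h (core, then postcore, arch, subsys, and
so on). It is not kernel code, and the helper names initcall_level_of() and
runs_before() are made up for illustration only. It just shows that a
core_initcall runs earlier than both an arch_initcall and a subsys_initcall,
which would be consistent with the results above if kaslr_init needs to run
before other users start taking vmalloc space:

```c
#include <stddef.h>
#include <string.h>

/*
 * Userspace model only (hypothetical helpers, not kernel API).
 * Level names are listed in the order the kernel runs them; the array
 * index is used purely as a relative ordering.
 */
static const char *const initcall_levels[] = {
	"pure", "core", "postcore", "arch", "subsys", "fs", "device", "late",
};

/* Return the position of the named level, or -1 if it is unknown. */
int initcall_level_of(const char *name)
{
	size_t n = sizeof(initcall_levels) / sizeof(initcall_levels[0]);

	for (size_t i = 0; i < n; i++) {
		if (strcmp(initcall_levels[i], name) == 0)
			return (int)i;
	}
	return -1;
}

/* Return nonzero if level 'a' runs strictly before level 'b'. */
int runs_before(const char *a, const char *b)
{
	int la = initcall_level_of(a);
	int lb = initcall_level_of(b);

	return la >= 0 && lb >= 0 && la < lb;
}
```

With this model, runs_before("core", "arch") and runs_before("arch", "subsys")
are both true, so moving kaslr_init from subsys_initcall to arch_initcall only
moves it one level earlier, while core_initcall moves it earlier still.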

>
> diff --git a/arch/arm64/kernel/kaslr.c b/arch/arm64/kernel/kaslr.c
> index 37a9deed2aec..71fb18b2f304 100644
> --- a/arch/arm64/kernel/kaslr.c
> +++ b/arch/arm64/kernel/kaslr.c
> @@ -90,4 +90,4 @@ static int __init kaslr_init(void)
>
>          return 0;
>   }
> -subsys_initcall(kaslr_init)
> +arch_initcall(kaslr_init)
> .
>



