regression: insmod module failed in VM with nvdimm on

chenxiang (M) chenxiang66 at hisilicon.com
Thu Dec 1 04:06:18 PST 2022



On 2022/12/1 19:07, Ard Biesheuvel wrote:
> On Thu, 1 Dec 2022 at 09:07, Ard Biesheuvel <ardb at kernel.org> wrote:
>> On Thu, 1 Dec 2022 at 08:15, chenxiang (M) <chenxiang66 at hisilicon.com> wrote:
>>> Hi Ard,
>>>
>>>
>>> On 2022/11/30 16:18, Ard Biesheuvel wrote:
>>>> On Wed, 30 Nov 2022 at 08:53, Marc Zyngier <maz at kernel.org> wrote:
>>>>> On Wed, 30 Nov 2022 02:52:35 +0000,
>>>>> "chenxiang (M)" <chenxiang66 at hisilicon.com> wrote:
>>>>>> Hi,
>>>>>>
>>>>>> We boot the VM using the following command (with nvdimm on; qemu
>>>>>> version 6.1.50, kernel 6.0-r4):
>>>>> How relevant is the presence of the nvdimm? Do you observe the failure
>>>>> without this?
>>>>>
>>>>>> qemu-system-aarch64 -machine
>>>>>> virt,kernel_irqchip=on,gic-version=3,nvdimm=on  -kernel
>>>>>> /home/kernel/Image -initrd /home/mini-rootfs/rootfs.cpio.gz -bios
>>>>>> /root/QEMU_EFI.FD -cpu host -enable-kvm -net none -nographic -m
>>>>>> 2G,maxmem=64G,slots=3 -smp 4 -append 'rdinit=init console=ttyAMA0
>>>>>> ealycon=pl0ll,0x90000000 pcie_ports=native pciehp.pciehp_debug=1'
>>>>>> -object memory-backend-ram,id=ram1,size=10G -device
>>>>>> nvdimm,id=dimm1,memdev=ram1  -device ioh3420,id=root_port1,chassis=1
>>>>>> -device vfio-pci,host=7d:01.0,id=net0,bus=root_port1
>>>>>>
>>>>>> Then, in the VM, we insmod a module and the following vmalloc error
>>>>>> occurs (kernel 5.19-rc4 is fine, and the issue is still present on
>>>>>> kernel 6.1-rc4):
>>>>>>
>>>>>> estuary:/$ insmod /lib/modules/$(uname -r)/hnae3.ko
>>>>>> [    8.186563] vmap allocation for size 20480 failed: use
>>>>>> vmalloc=<size> to increase size
>>>>> Have you tried increasing the vmalloc size to check that this is
>>>>> indeed the problem?
>>>>>
>>>>> [...]
>>>>>
>>>>>> We bisected with git and found the patch c5a89f75d2a ("arm64: kaslr:
>>>>>> defer initialization to initcall where permitted").
>>>>> I guess you mean commit fc5a89f75d2a instead, right?
>>>>>
>>>>>> Do you have any idea about the issue?
>>>>> I sort of suspect that the nvdimm gets vmap-ed and consumes a large
>>>>> portion of the vmalloc space, but you give very little information
>>>>> that could help here...
>>>>>
>>>> Ouch. I suspect what's going on here: that patch defers the
>>>> randomization of the module region, so that we can decouple it from
>>>> the very early init code.
>>>>
>>>> Obviously, it is happening too late now, and the randomized module
>>>> region is overlapping with a vmalloc region that is in use by the time
>>>> the randomization occurs.
>>>>
>>>> Does the below fix the issue?
>>> The issue still occurs, but the change seems to make it less likely.
>>> Before, it occurred almost every time; after the change, I tried 2-3
>>> times before it occurred.
>>> But when I change "subsys_initcall" back to "core_initcall" and test
>>> more than 20 times, it is still OK.
>>>
>> Thank you for confirming. I will send out a patch today.
>>
> ...but before I do that, could you please check whether the change
> below fixes your issue as well?

Yes, but I can only reply to you tomorrow, as someone else is testing on
our only environment today.

>
> diff --git a/arch/arm64/kernel/kaslr.c b/arch/arm64/kernel/kaslr.c
> index 6ccc7ef600e7c1e1..c8c205b630da1951 100644
> --- a/arch/arm64/kernel/kaslr.c
> +++ b/arch/arm64/kernel/kaslr.c
> @@ -20,7 +20,11 @@
>   #include <asm/sections.h>
>   #include <asm/setup.h>
>
> -u64 __ro_after_init module_alloc_base;
> +/*
> + * Set a reasonable default for module_alloc_base in case
> + * we end up running with module randomization disabled.
> + */
> +u64 __ro_after_init module_alloc_base = (u64)_etext - MODULES_VSIZE;
>   u16 __initdata memstart_offset_seed;
>
>   struct arm64_ftr_override kaslr_feature_override __initdata;
> @@ -30,12 +34,6 @@ static int __init kaslr_init(void)
>          u64 module_range;
>          u32 seed;
>
> -       /*
> -        * Set a reasonable default for module_alloc_base in case
> -        * we end up running with module randomization disabled.
> -        */
> -       module_alloc_base = (u64)_etext - MODULES_VSIZE;
> -
>          if (kaslr_feature_override.val & kaslr_feature_override.mask & 0xf) {
>                  pr_info("KASLR disabled on command line\n");
>                  return 0;
> .
>
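For readers following the diff above: it moves the default for
module_alloc_base out of kaslr_init() and into the variable's static
initializer, so the variable already holds the default before the
initcall has run. The user-space sketch below only illustrates that
difference. FAKE_ETEXT, FAKE_MODULES_VSIZE, fake_kaslr_init(),
read_base() and demo_check() are hypothetical stand-ins, not kernel
symbols, and the addresses are made up.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins for _etext and MODULES_VSIZE; not real kernel values. */
#define FAKE_ETEXT         0x80000000ULL
#define FAKE_MODULES_VSIZE 0x08000000ULL

/* Before the diff: the variable stays zero until the init hook runs. */
static uint64_t base_assigned_in_init;

/* After the diff: the default is part of the static initializer. */
static uint64_t base_static_default = FAKE_ETEXT - FAKE_MODULES_VSIZE;

/* Stand-in for the assignment that the diff removes from kaslr_init(). */
void fake_kaslr_init(void)
{
	base_assigned_in_init = FAKE_ETEXT - FAKE_MODULES_VSIZE;
}

/* Stand-in for any code that reads the base address. */
uint64_t read_base(const uint64_t *base)
{
	return *base;
}

/* Compares what a reader sees before and after the init hook has run. */
void demo_check(void)
{
	/* A reader that runs before the init hook sees zero in the old layout... */
	assert(read_base(&base_assigned_in_init) == 0);
	/* ...but already sees the default in the new layout. */
	assert(read_base(&base_static_default) == 0x78000000ULL);

	fake_kaslr_init();

	/* Once the init hook has run, both layouts agree. */
	assert(read_base(&base_assigned_in_init) ==
	       read_base(&base_static_default));
}
```

Calling demo_check() once from main() exercises all three cases. Whether
an early reader of module_alloc_base is what actually triggers the
failure reported in this thread is not established here; the sketch only
shows what the static initializer changes.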




More information about the linux-arm-kernel mailing list