riscv+KASAN does not boot

Alex Ghiti alex at ghiti.fr
Wed Feb 17 11:36:40 EST 2021


On 2/16/21 at 11:42 PM, Dmitry Vyukov wrote:
> On Tue, Feb 16, 2021 at 9:42 PM Alex Ghiti <alex at ghiti.fr> wrote:
>>
>> Hi Dmitry,
>>
>> On 2/16/21 at 6:25 AM, Dmitry Vyukov wrote:
>>> On Tue, Feb 16, 2021 at 12:17 PM Dmitry Vyukov <dvyukov at google.com> wrote:
>>>>
>>>> On Fri, Jan 29, 2021 at 9:11 AM Dmitry Vyukov <dvyukov at google.com> wrote:
>>>>>> I was fixing KASAN support for my sv48 patchset so I took a look at your
>>>>>> issue: I built a kernel on top of the branch riscv/fixes using
>>>>>> https://github.com/google/syzkaller/blob/269d24e857a757d09a898086a2fa6fa5d827c3e1/dashboard/config/linux/upstream-riscv64-kasan.config
>>>>>> and Buildroot 2020.11. I have the warnings regarding the use of
>>>>>> __virt_to_phys on wrong addresses (but that's normal since this function
>>>>>> is used in virt_addr_valid) but not the segfaults you describe.
>>>>>
>>>>> Hi Alex,
>>>>>
>>>>> Let me try to rebuild the buildroot image. Maybe there was something wrong
>>>>> with my build, though I did 'make clean' before building. But at the
>>>>> same time it worked back in June...
>>>>>
>>>>> Re WARNINGs, they indicate kernel bugs. I am working on setting up a
>>>>> syzbot instance on riscv. If there is a WARNING during boot then the
>>>>> kernel will be marked as broken. No further testing will happen.
>>>>> Is it a misuse of WARN_ON? If so, could anybody please remove it or
>>>>> replace it with pr_err?
>>>>
>>>>
>>>> Hi,
>>>>
>>>> I've localized one issue with riscv/KASAN:
>>>> KASAN breaks the VDSO, and I think that is the root cause of the weird
>>>> faults I saw earlier. The following patch fixes it.
>>>> Could somebody please upstream this fix? I don't know how to add/run
>>>> tests for this.
>>>> Thanks
>>>>
>>>> diff --git a/arch/riscv/kernel/vdso/Makefile b/arch/riscv/kernel/vdso/Makefile
>>>> index 0cfd6da784f84..cf3a383c1799d 100644
>>>> --- a/arch/riscv/kernel/vdso/Makefile
>>>> +++ b/arch/riscv/kernel/vdso/Makefile
>>>> @@ -35,6 +35,7 @@ CFLAGS_REMOVE_vgettimeofday.o = $(CC_FLAGS_FTRACE) -Os
>>>>    # Disable gcov profiling for VDSO code
>>>>    GCOV_PROFILE := n
>>>>    KCOV_INSTRUMENT := n
>>>> +KASAN_SANITIZE := n
>>>>
>>>>    # Force dependency
>>>>    $(obj)/vdso.o: $(obj)/vdso.so
>>
>> What's weird is that I don't have any issue without this patch with the
>> following config whereas it indeed seems required for KASAN. But when
>> looking at the segfaults you got earlier, the segfault address is 0xbb0
>> and the cause is an instruction page fault: this address is the PLT base
>> address in vdso.so and an instruction page fault would mean that someone
>> tried to jump to this address, which is weird. At first sight, that does
>> not seem related to your patch above, but clearly I may be wrong.
>>
>> Tobias, did you observe the same segfaults as Dmitry?
> 
> 
> I noticed that not all buildroot images use VDSO, it seems to be
> dependent on libc settings (at least I think I changed it in the
> past).

Ok, I was using uClibc. With glibc, I get the same segfaults, but only
when KASAN is enabled, and your patch fixes the problem. I will try to
take a look later to better understand the problem.
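
For anyone comparing images, here is a rough sketch of a check that can be
run inside the guest. It assumes python3 is present in the rootfs (which is
not the case in every Buildroot config), and the function names are only
illustrative. It reports whether the kernel mapped a [vdso] region into the
process and then exercises clock_gettime() through the libc. It cannot tell
by itself whether the libc actually goes through the VDSO; it only gives a
quick "does this crash?" signal on a given libc/KASAN combination.

```python
import time


def find_vdso(maps_path="/proc/self/maps"):
    """Return the address range of the [vdso] mapping, or None if absent."""
    with open(maps_path) as maps:
        for line in maps:
            fields = line.split()
            # The pathname column is the last field; "[vdso]" marks the VDSO.
            if fields and fields[-1] == "[vdso]":
                return fields[0]
    return None


def sample_clock(samples=1000):
    """Call clock_gettime() repeatedly; return the first and last readings."""
    first = time.clock_gettime(time.CLOCK_MONOTONIC)
    last = first
    for _ in range(samples):
        # CPython forwards this to the libc clock_gettime(); whether that
        # uses the VDSO or a real syscall depends on the libc.
        last = time.clock_gettime(time.CLOCK_MONOTONIC)
    return first, last


if __name__ == "__main__":
    print("vdso mapping:", find_vdso())
    print("clock_gettime samples:", sample_clock())
```

If the interpreter itself segfaults in sample_clock(), that points in the
same direction as the crashes discussed here, but it is not a proof.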

> I also booted an image completely successfully including dhcpd/sshd
> start, but then my executable crashed in clock_gettime. The executable
> was built on a linux/amd64 host with "riscv64-linux-gnu-gcc -static"
> (10.2.1).
> 
> 
>>> The second issue I am seeing seems to be related to text segment size.
>>> I checked out v5.11 and used this config:
>>> https://gist.github.com/dvyukov/6af25474d455437577a84213b0cc9178
>>
>> This config gave my laptop a hard time! I was finally able to boot
>> correctly to userspace, but then I realized I had used my sv48 branch...
>> Either I fixed your issue along the way or I can't reproduce it. I'll
>> give it a try tomorrow.
> 
> Where is your branch? I could also test in my setup on your branch.
> 

You can find my branch int/alex/riscv_kernel_end_of_address_space_v2 
here: https://github.com/AlexGhiti/riscv-linux.git

Thanks,

> 
>>> Then trying to boot it using:
>>> QEMU emulator version 5.2.0 (Debian 1:5.2+dfsg-3)
>>> $ qemu-system-riscv64 -machine virt -smp 2 -m 4G ...
>>>
>>> It shows no output from the kernel whatsoever, even though I have
>>> earlycon and output shows very early with other configs.
>>> Kernel boots fine with defconfig and other smaller configs.
>>>
>>> If I enable KASAN_OUTLINE and CC_OPTIMIZE_FOR_SIZE, then this config
>>> also boots fine. Both of these options significantly reduce kernel
>>> size. However, I can also boot the kernel without these 2 configs, if
>>> I disable a whole lot of subsystem configs. This makes me think that
>>> there is an issue related to kernel size somewhere in
>>> qemu/bootloader/kernel bootstrap code.
>>> Does it make sense to you? Can somebody reproduce what I am seeing?
>>
>> I have not answered your question yet, but at least you know I'm
>> working on it. I'll keep you posted.
>>
>> Thanks for taking the time to set up syzkaller.
>>
>> Alex
>>
>>> Thanks
>>>
>>> _______________________________________________
>>> linux-riscv mailing list
>>> linux-riscv at lists.infradead.org
>>> http://lists.infradead.org/mailman/listinfo/linux-riscv
>>>
> 


