Arm + KASAN + syzbot

Dmitry Vyukov dvyukov at google.com
Tue Jan 19 05:18:21 EST 2021


On Tue, Jan 19, 2021 at 9:41 AM Linus Walleij <linus.walleij at linaro.org> wrote:
> > We are considering setting up an Arm 32-bit instance on syzbot for
> > continuous testing using qemu emulation and I have several questions
> > related to that.
>
> That's interesting. I don't know much about syzbot but it reminds me
> of syzkaller.

Hi Linus,

Yes, these are related. syzkaller is a fuzzer, while syzbot is a
continuous fuzzing and bug reporting system.
It's like clang and a CI that uses clang to do continuous builds.


> > 1. Is there interest in this on your end? What git tree/branch should
> > be used for testing (contains latest development and is regularly
> > updated with fixes)?
>
> The most important would be Russell's branch I think, that is where
> the core architecture changes end up. They also land in linux-next.
>
> I think for the core developers this is the interesting tree,
> the corporate users mostly use KASAN for fuzzing their
> out-of-tree codebase and that is not of our concern. There can
> be some specific platforms we want to test but they mostly
> require real hardware because the interesting bugs tend to be
> in drivers and driver subsystems that only gets exercised on
> real hardware (not Qemu).

See my previous reply to Krzysztof re freshness of the tree.
I don't know, maybe it's just due to the winter holidays and usually it's
updated more frequently?


> > 2. I see KASAN has just become supported for Arm, which is very
> > useful, but I can't boot a kernel with KASAN enabled. I am using
> > v5.11-rc4 and this config without KASAN boots fine:
> > https://gist.githubusercontent.com/dvyukov/12de2905f9479ba2ebdcc603c2fec79b/raw/c8fd3f5e8328259fe760ce9a57f3e6c6f5a95c8f/gistfile1.txt
> > using the following qemu command line:
> > qemu-system-arm \
> >   -machine vexpress-a15 -cpu max -smp 2 -m 2G \
> >   -device virtio-blk-device,drive=hd0 \
> >   -drive if=none,format=raw,id=hd0,file=image-arm -snapshot \
> >   -kernel arch/arm/boot/zImage \
> >   -dtb arch/arm/boot/dts/vexpress-v2p-ca15-tc1.dtb \
> >   -nographic \
> >   -netdev user,host=10.0.2.10,hostfwd=tcp::10022-:22,id=net0 \
> >   -device virtio-net-device,netdev=net0 \
> >   -append "root=/dev/vda earlycon earlyprintk=serial console=ttyAMA0 oops=panic panic_on_warn=1 panic=86400 vmalloc=512M"
> >
> > However, when I enable KASAN and get this config:
> > https://gist.githubusercontent.com/dvyukov/a7e3edd35cc39a1b69b11530c7d2e7ac/raw/7cbda88085d3ccd11227224a1c9964ccb8484d4e/gistfile1.txt
> >
> > kernel does not boot, qemu only prints the following output and then silence:
> > pulseaudio: set_sink_input_volume() failed
> > pulseaudio: Reason: Invalid argument
> > pulseaudio: set_sink_input_mute() failed
> > pulseaudio: Reason: Invalid argument
> >
> > What am I doing wrong?
>
> I tried it with both KASAN_INLINE and KASAN_OUTLINE this
> morning on Torvald's tree and it works fine for me.
> I brought it up with this and it booted (takes ~30 seconds to come up
> on an i7).
>
> Here is my config:
> https://dflund.se/~triad/krad/vexpress_config.txt


See my previous reply to Krzysztof re syzbot configs. syzbot can't use
random configs.


> > 3. CONFIG_KCOV does not seem to fully work.
> > It seems to work except for when the kernel crashes, and that's the
> > most interesting scenario for us. When the kernel crashes for other
> > reasons, crash handlers re-crash in KCOV making all crashes
> > unactionable and indistinguishable.
> > Here are some samples (search for __sanitizer_cov_trace):
> > https://gist.githubusercontent.com/dvyukov/c8a7ff1c00a5223c5143fd90073f5bc4/raw/c0f4ac7fd7faad7253843584fed8620ac6006338/gistfile1.txt
> > Perhaps some additional Makefiles in arch/arm need KCOV_INSTRUMENT :=
> > n to fix this.
> > And LKDTM can be used for testing:
> > https://www.kernel.org/doc/html/latest/fault-injection/provoke-crashes.html
>
> I have never used CONFIG_KCOV really, it's yet another universe
> that I haven't looked into.

I looked up who added KCOV support for Arm so I could CC them... it turned out to be me...

I think we need to disable KCOV instrumentation for arch/arm/mm/fault.c:
[<802b36ac>] (__sanitizer_cov_trace_const_cmp4) from [<801228c8>]
(do_DataAbort+0x60/0xe8 arch/arm/mm/fault.c:522)


But I am more concerned about the following stacks. These happen in
generic kernel files. The most likely reason for crashes in KCOV
would be that current is not properly set up. So presumably these arm
thunks call into common kernel code without current being properly
set up. Other arches (x86_64, arm64) should do it properly, so maybe
it's fixable for arm as well?

[<802b3ffc>] (__sanitizer_cov_trace_pc) from [<802e12cc>]
(trace_hardirqs_off+0x14/0x120 kernel/trace/trace_preemptirq.c:76)
[<802e12b8>] (trace_hardirqs_off) from [<80100a74>]
(__dabt_svc+0x54/0xa0 arch/arm/kernel/entry-armv.S:194)

[<802b3460>] (write_comp_data) from [<802b3728>]
(__sanitizer_cov_trace_const_cmp8+0x40/0x48 kernel/kcov.c:291)
 r9:00000000 r8:8a2f00ac r7:00000000 r6:dead4ead r5:00000000 r4:00000000
[<802b36e8>] (__sanitizer_cov_trace_const_cmp8) from [<801f6cb8>]
(vprintk_func+0xf0/0x2ac kernel/printk/printk_safe.c:385)
 r7:83f885a4 r6:dead4ead r5:00000000 r4:844f2694
[<801f6bc8>] (vprintk_func) from [<8367b7b0>] (printk+0x40/0x68
kernel/printk/printk.c:2076)
 r9:8a2f0000 r8:8456390c r7:8a2f00f0 r6:00000367 r5:00000001 r4:83f885a4
[<8367b770>] (printk) from [<801228ec>] (do_DataAbort+0x84/0xe8
arch/arm/mm/fault.c:525)
 r3:dead4ead r2:00ad0000 r1:ffffffff r0:83f885a4
 r4:0000001b
[<80122868>] (do_DataAbort) from [<80100a7c>] (__dabt_svc+0x5c/0xa0
arch/arm/kernel/entry-armv.S:196)
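
To find the remaining places systematically, the crash logs can be
scanned for frames whose callee is a __sanitizer_cov_trace* hook. Below
is a minimal Python sketch, assuming the backtrace format shown above
(including the mail-style line wrapping). The regex and the
kcov_callers name are my own illustration, not part of syzkaller or
any existing tool:

```python
import re

# One ARM backtrace frame, after whitespace has been collapsed:
#   [<addr>] (callee) from [<addr>] (caller+0xOFF/0xSIZE path/file.c:LINE)
FRAME = re.compile(
    r"\[<[0-9a-f]+>\] \((?P<callee>\w+)\) from \[<[0-9a-f]+>\] "
    r"\((?P<caller>\w+)\+0x[0-9a-f]+/0x[0-9a-f]+ "
    r"(?P<file>[^\s:)]+):(?P<line>\d+)\)"
)


def kcov_callers(log_text):
    """Return (caller, file, line) for every frame that entered a KCOV hook."""
    # Collapse newlines and runs of spaces so wrapped frames match as one unit.
    flat = " ".join(log_text.split())
    hits = []
    for m in FRAME.finditer(flat):
        # Only frames whose callee is a KCOV instrumentation hook are of interest.
        if m.group("callee").startswith("__sanitizer_cov_trace"):
            hits.append((m.group("caller"), m.group("file"), int(m.group("line"))))
    return hits
```

Feeding it the text of a crash log (for example the gist linked
earlier, saved locally) should give a list of the functions and source
files that call into KCOV on the crash path, which are the candidates
for KCOV_INSTRUMENT := n. Frames that do not carry a file:line suffix
are skipped by this sketch.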


