[PATCH v4] arm64: Enable KCSAN
Kefeng Wang
wangkefeng.wang at huawei.com
Tue Dec 14 20:08:02 PST 2021
On 2021/12/15 2:24, Mark Rutland wrote:
> On Sat, Dec 11, 2021 at 09:17:34PM +0800, Kefeng Wang wrote:
>> This patch enables KCSAN for arm64, with updates to build rules
>> to not use KCSAN for several incompatible compilation units.
>>
>> Recent GCC version(at least GCC10) made outline-atomics as the
>> default option(unlike Clang), which will cause linker errors
>> for kernel/kcsan/core.o. Disables the out-of-line atomics by
>> no-outline-atomics to fix the linker errors.
>>
>> Meanwhile, as Mark said[1], some latent issues are needed to be
>> fixed which isn't just a KCSAN problem, we make the KCSAN depends
>> on EXPERT for now.
>>
>> Tested selftest and kcsan_test(built with GCC11 and Clang 13),
>> and all passed.
>>
>> [1] https://lkml.kernel.org/r/YadiUPpJ0gADbiHQ@FVFF77S0Q05N
>> Acked-by: Marco Elver <elver at google.com> # kernel/kcsan
>> Tested-by: Joey Gouly <joey.gouly at arm.com>
>> Signed-off-by: Kefeng Wang <wangkefeng.wang at huawei.com>
>> ---
>>
>> Tested on Qemu with clang 13 / gcc 11, based on 5.16-rc3 by Kefeng.
>> Tested with gcc 11 and clang 14.0 (built from git) on qemu and FVP by Joey.
> Hi,
>
> Sorry for the silence on v3; I got a little busy elsewhere and wasn't able to
> test that, but I have given this v4 a fairly intensive test with KCSAN combined
> with other config options I use when fuzzing.
>
> I eyeballed the output from a bunch of compilers (noted below), and there's
> still a fair amount of potentially-unsound instrumentation, but I'm happy with
> that so long as we have the EXPERT dependency. I agree we can fix those
> (latent) issues with follow-up work.
>
> I have one minor comment below about adding a comment, but aside from that,
> this looks good and I didn't encounter any unexpected issues while testing, so:
>
> Reviewed-by: Mark Rutland <mark.rutland at arm.com>
> Tested-by: Mark Rutland <mark.rutland at arm.com>
Many thanks for your , Mark.
>> v4:
>> - drop Clang version as commit 8cdd23c23c3d ("arm64: Restrict ARM64_BTI_KERNEL
>> to clang 12.0.0 and newer"), suggested by Nathan Chancellor
>> v3:
>> - add EXPERT and CLANG_VERSION depends suggested by Mark Rutland
>> v2:
>> - tested on GCC11 and disable outline-atomics for kernel/kcsan/core.c
>> suggested by Marco Elver
>>
>> arch/arm64/Kconfig | 1 +
>> arch/arm64/kernel/vdso/Makefile | 1 +
>> arch/arm64/kvm/hyp/nvhe/Makefile | 1 +
>> kernel/kcsan/Makefile | 1 +
>> 4 files changed, 4 insertions(+)
>>
>> diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
>> index 4ff73299f8a9..2cc9dea55e00 100644
>> --- a/arch/arm64/Kconfig
>> +++ b/arch/arm64/Kconfig
>> @@ -150,6 +150,7 @@ config ARM64
>> select HAVE_ARCH_KASAN_VMALLOC if HAVE_ARCH_KASAN
>> select HAVE_ARCH_KASAN_SW_TAGS if HAVE_ARCH_KASAN
>> select HAVE_ARCH_KASAN_HW_TAGS if (HAVE_ARCH_KASAN && ARM64_MTE)
>> + select HAVE_ARCH_KCSAN if EXPERT
> It might be worth adding:
>
> /* Some instrumentation may be unsound */
>
> ... immediately above this to clarify why we added the EXPERT dependency here.
>
> Other than that trivial comment, this looks good to me!
>
> For future reference, below is an info dump about how I've tested this, and
> which latent issues KCSAN reported during testing. Feel free to ignore for now.
> :)
Our tester wants this feature to test our kernel, this is a good start
on ARM64,
more people could be getting involved to enhance it, We could fix the KCSAN
Warning one by one :)
>
> ... the summary of which is below:
>
> * BUG: KCSAN: data-race in capable / cgroup_freezer_migrate_task
> * BUG: KCSAN: data-race in console_unlock / vprintk_emit
> * BUG: KCSAN: data-race in __d_add / __d_add
> * BUG: KCSAN: data-race in __do_sys_prctl / do_task_stat
> * BUG: KCSAN: data-race in do_task_stat / sigprocmask
> * BUG: KCSAN: data-race in ep_modify / ep_poll_callback
> * BUG: KCSAN: data-race in _find_next_bit+0x60/0x194
> * BUG: KCSAN: data-race in folio_mark_accessed / workingset_activation
> * BUG: KCSAN: data-race in ktime_get / timekeeping_advance
> * BUG: KCSAN: data-race in more_used_split+0x38/0x60
> * BUG: KCSAN: data-race in mutex_spin_on_owner+0xcc/0x150
> * BUG: KCSAN: data-race in rwsem_spin_on_owner+0xa8/0x13c
> * BUG: KCSAN: data-race in start_dir_add+0x34/0xd4
> * BUG: KCSAN: data-race in tick_nohz_idle_stop_tick / tick_nohz_idle_stop_tick
> * BUG: KCSAN: data-race in timekeeping_delta_to_ns+0x34/0x94
> * BUG: KCSAN: data-race in virtqueue_get_buf_ctx+0xe8/0x458
> * BUG: KCSAN: data-race in virtqueue_get_buf_ctx_split+0x68/0x2dc
> * BUG: KCSAN: data-race in vprintk_emit+0x320/0x494
>
> By far the most commonly reported races were in "{mutex,rwsem}_spin_on_owner",
> followed by "timekeeping_delta_to_ns" followed by "virtqueue_get_buf_ctx*". So
> those look like the ones to focus on in the near term.
>
> When using GCC I would very often see reports I didn't see with LLVM, for
> "vprintk_emit" and "ktime_get / timekeeping_advance". I don't know if that's
> down to something getting optimized away or something getting instrumented
> differently.
>
More information about the linux-arm-kernel
mailing list