[RFC PATCH v2 00/21] KCFI support

Sedat Dilek sedat.dilek at gmail.com
Thu May 19 02:01:40 PDT 2022


On Tue, May 17, 2022 at 8:49 PM Nathan Chancellor <nathan at kernel.org> wrote:
>
> On Tue, May 17, 2022 at 09:33:47AM +0200, Sedat Dilek wrote:
> > Is your tags/kcfi-rfc-v2 the recommended LLVM toolchain for testing kcfi-rfc-v2?
> >
> > What I want to ask is whether "well tested for x86 (and arm64)" in your
> > commit means built and booted on bare metal.
>
> I have run kCFI (v1, I haven't had time to test v2) on x86_64 hardware,
> both AMD and Intel. I have only found two failures so far: the i915
> issue that I mention below and a failure in the KVM subsystem, which I
> can see by just running QEMU:
>

Is there a fix available for this issue?

> [  124.344654] CFI failure at kvm_mmu_notifier_invalidate_range_end+0xca/0x177 [kvm] (target: kvm_null_fn+0x0/0x7 [kvm]; expected type: 0x1e2c5b9c)
> [  124.344691] WARNING: CPU: 5 PID: 2767 at kvm_mmu_notifier_invalidate_range_end+0xca/0x177 [kvm]
> [  124.344708] Modules linked in: ...
> [  124.344737] CPU: 5 PID: 2767 Comm: qemu-system-x86 Tainted: P                  5.18.0-rc6-debug-00033-g1d5284aff7cd #1 8c4966c7fb24f3345076747feebac5fb340a4797
> [  124.344738] Hardware name: ASUS System Product Name/PRIME Z590M-PLUS, BIOS 1203 10/27/2021
> [  124.344738] RIP: 0010:kvm_mmu_notifier_invalidate_range_end+0xca/0x177 [kvm]
> [  124.344756] Code: 48 89 da e8 88 16 86 dd 48 85 c0 89 6c 24 0c 74 30 4d 85 f6 74 36 48 c7 c5 19 c8 b4 c0 4c 8b 34 24 81 7d fa 9c 5b 2c 1e 74 02 <0f> 0b 48 89 c7 4c 89 fe 48 89 da e8 b6 16 86 dd 48 85 c0 75 e2 eb
> [  124.344757] RSP: 0018:ffffc033039d78e8 EFLAGS: 00010282
> [  124.344758] RAX: ffffa0a141ed9450 RBX: 00007fb897e9efff RCX: ffffa0a143c9d850
> [  124.344758] RDX: 00007fb897e9efff RSI: 00007fb897e9e000 RDI: ffffc03304499cf8
> [  124.344759] RBP: ffffffffc0b4c819 R08: 0000000000000000 R09: 0000000000000000
> [  124.344759] R10: 0000000000000000 R11: ffffffffc0b4c639 R12: ffffc03304499000
> [  124.344760] R13: ffffc033044a30a0 R14: ffffc033044a3120 R15: 00007fb897e9e000
> [  124.344760] FS:  00007fbca4bc3640(0000) GS:ffffa0a87f540000(0000) knlGS:0000000000000000
> [  124.344761] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [  124.344761] CR2: 0000000000000000 CR3: 0000000119f74006 CR4: 0000000000772ee0
> [  124.344762] PKRU: 55555554
> [  124.344762] Call Trace:
> [  124.344762]  <TASK>
> [  124.344763]  __mmu_notifier_invalidate_range_end+0xa1/0xd7
> [  124.344764]  wp_page_copy+0x592/0x920
> [  124.344766]  __handle_mm_fault+0x820/0x8f0
> [  124.344767]  handle_mm_fault+0xe0/0x227
> [  124.344768]  __get_user_pages+0x17a/0x430
> [  124.344769]  get_user_pages_unlocked+0xd9/0x327
> [  124.344770]  hva_to_pfn+0xfa/0x3f7 [kvm 950e483805e01f967a3e331506278d766494a4cc]
> [  124.344788]  kvm_faultin_pfn+0xc3/0x2f0 [kvm 950e483805e01f967a3e331506278d766494a4cc]
> [  124.344806]  ? fast_page_fault+0x400/0x4c0 [kvm 950e483805e01f967a3e331506278d766494a4cc]
> [  124.344823]  direct_page_fault+0x130/0x350 [kvm 950e483805e01f967a3e331506278d766494a4cc]
> [  124.344841]  kvm_mmu_page_fault+0x176/0x2f7 [kvm 950e483805e01f967a3e331506278d766494a4cc]
> [  124.344858]  vmx_handle_exit+0xe/0x37 [kvm_intel 647f514f18fc365045dc57db46d2bb4930b746eb]
> [  124.344863]  vcpu_enter_guest+0xbff/0x1030 [kvm 950e483805e01f967a3e331506278d766494a4cc]
> [  124.344881]  vcpu_run+0x65/0x2f0 [kvm 950e483805e01f967a3e331506278d766494a4cc]
> [  124.344898]  kvm_arch_vcpu_ioctl_run+0x15f/0x3f7 [kvm 950e483805e01f967a3e331506278d766494a4cc]
> [  124.344916]  kvm_vcpu_ioctl+0x547/0x627 [kvm 950e483805e01f967a3e331506278d766494a4cc]
> [  124.344933]  ? syscall_exit_to_user_mode+0x24/0x47
> [  124.344934]  ? do_syscall_64+0x7a/0x97
> [  124.344935]  ? __fget_files+0xa1/0xc0
> [  124.344936]  __se_sys_ioctl+0x7c/0xc0
> [  124.344937]  do_syscall_64+0x6c/0x97
> [  124.344938]  ? do_syscall_64+0x7a/0x97
> [  124.344939]  entry_SYSCALL_64_after_hwframe+0x44/0xae
> [  124.344940] RIP: 0033:0x7fbca75f7b1f
> [  124.344940] Code: 00 48 89 44 24 18 31 c0 48 8d 44 24 60 c7 04 24 10 00 00 00 48 89 44 24 08 48 8d 44 24 20 48 89 44 24 10 b8 10 00 00 00 0f 05 <89> c2 3d 00 f0 ff ff 77 18 48 8b 44 24 18 64 48 2b 04 25 28 00 00
> [  124.344941] RSP: 002b:00007fbca4bc2550 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
> [  124.344942] RAX: ffffffffffffffda RBX: 000000000000ae80 RCX: 00007fbca75f7b1f
> [  124.344942] RDX: 0000000000000000 RSI: 000000000000ae80 RDI: 000000000000000e
> [  124.344943] RBP: 000056351ba661d0 R08: 000056351a74dc48 R09: 0000000000000004
> [  124.344943] R10: 004818cf4f129e11 R11: 0000000000000246 R12: 0000000000000000
> [  124.344943] R13: 0000000000000000 R14: 00007fbca814f004 R15: 0000000000000000
> [  124.344944]  </TASK>
> [  124.344945] ---[ end trace 0000000000000000 ]---
>
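
For readers following the trace: the "Code:" line above appears to show the check itself. The bytes "81 7d fa 9c 5b 2c 1e" decode as a 32-bit compare of the value at -6(%rbp) against 0x1e2c5b9c, "74 02" is a je that skips the next instruction, and the marked "<0f> 0b" is ud2. RBP holds the call target (ffffffffc0b4c819), so the caller compares a hash stored just in front of the target with the hash it expects and traps on a mismatch. Below is a minimal user-space sketch of that idea, not the kernel's or the compiler's implementation. The struct, the function names and OTHER_HASH are made up for illustration; only 0x1e2c5b9c comes from the report.

```c
#include <stdint.h>
#include <stdio.h>

/* Expected type hash, taken from the CFI failure message above. */
#define EXPECTED_HASH 0x1e2c5b9cu
/* Made-up hash standing in for "a function of some other type". */
#define OTHER_HASH    0xdeadbeefu

typedef int (*range_fn)(unsigned long start, unsigned long end);

/* Hypothetical stand-in for "type hash stored in front of the function". */
struct tagged_fn {
	uint32_t type_hash;
	range_fn entry;
};

/* Example target: returns the length of the range. */
static int range_len(unsigned long start, unsigned long end)
{
	return (int)(end - start);
}

/* Same code, once tagged with the expected hash and once with another one. */
static const struct tagged_fn good_target = { EXPECTED_HASH, range_len };
static const struct tagged_fn bad_target  = { OTHER_HASH, range_len };

/*
 * Compare the target's hash with the caller's expected hash before the
 * indirect call. Returns 0 and stores the result on a match; returns -1
 * without calling the target on a mismatch (the kernel would trap here).
 */
static int checked_call(const struct tagged_fn *target, uint32_t expected,
			unsigned long start, unsigned long end, int *result)
{
	if (target->type_hash != expected) {
		fprintf(stderr, "CFI failure (expected type: 0x%08x)\n",
			(unsigned int)expected);
		return -1;
	}
	*result = target->entry(start, end);
	return 0;
}
```

Calling checked_call() with good_target and EXPECTED_HASH performs the call; with bad_target it refuses, which is the situation the warning above reports for kvm_null_fn.
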
> I don't do any bare metal testing for the tc-build known good revision
> bumps, just QEMU tests with boot-utils.
>
> > Just for the record:
> > You definitely need a pre-LLVM-15 toolchain + KCFI sanitizer patch?
> > LLVM-14?
>
> The feature won't land in LLVM 14, so there is little point to
> backporting and testing it on LLVM 14. This is a work-in-progress
> feature, so it has to target the main branch. I have fairly good coverage
> of tip of tree locally and we have solid coverage through CI, so we'll
> know if something major happens. It should be pretty safe to test Sami's
> series.
>

OK, no backport.

> > > > Nathan has an i915 CFI patch in his personal kernel.org Git tree.
> > > > Is this relevant to kCFI?
> > >
> > > It fixes a type mismatch, so in that sense it's relevant.
> > >
> >
> > Here is the link to the patch "drm/i915: Fix CFI violation with show_dynamic_id()":
> >
> > https://git.kernel.org/pub/scm/linux/kernel/git/nathan/linux.git/commit/?h=submitted/i915-cfi-fix&id=53735be6dc53453fcfbac658e847b54360e73871
>
> This is now in the i915 tree so that branch is going to disappear:
>
> https://cgit.freedesktop.org/drm/drm-intel/commit/?id=18fb42db05a0b93ab5dd5eab5315e50eaa3ca620
>
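
As a generic illustration of what "fixes a type mismatch" usually means for an indirect call (this is not the i915 patch itself, and every name below is hypothetical): the callback is given exactly the prototype of the function pointer it is called through, and the conversion to the more specific type happens inside the callback instead of through a cast of the function pointer.

```c
#include <stddef.h>

/* Hypothetical generic attribute and a more specific one embedding it. */
struct base_attr {
	const char *name;
};

struct dyn_attr {
	struct base_attr base;
	int id;
};

/* The pointer type the (hypothetical) core code calls through. */
typedef int (*show_fn)(const struct base_attr *attr);

/*
 * The callback has exactly the show_fn prototype, so an indirect call
 * through a show_fn pointer sees the type it expects. The step from the
 * embedded base_attr back to the containing dyn_attr is done here, in the
 * body, in the style of the kernel's container_of().
 */
static int show_dyn_id(const struct base_attr *attr)
{
	const struct dyn_attr *dyn = (const struct dyn_attr *)
		((const char *)attr - offsetof(struct dyn_attr, base));

	return dyn->id;
}

/* Stand-in for the core code making the indirect call. */
static int call_show(show_fn fn, const struct base_attr *attr)
{
	return fn(attr);
}
```

The variant that a CFI check would reject is a callback declared as taking "const struct dyn_attr *" and then cast to show_fn when it is registered: the code runs fine without CFI, but the pointer type and the function type no longer agree.
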

Good to know.

> > You say no need to build your kernel with LTO...
> > That sounds good.
> > Currently, I build my kernels with Clang-14 and CONFIG_LTO_CLANG_THIN=y.
> > Is there any reason not to use CONFIG_LTO_CLANG_THIN=y with KCFI support?
> > Build time?
> > Disk usage?
>
> I wouldn't expect the build time or disk usage to significantly increase
> with kCFI. I ran a quick benchmark with Arch Linux's configuration with
> ThinLTO then ThinLTO + kCFI on an AMD EPYC 7513:
>
> Benchmark 1: ThinLTO
>   Time (abs ≡):        166.036 s               [User: 4687.951 s, System: 1636.767 s]
>
> Benchmark 2: ThinLTO + kCFI
>   Time (abs ≡):        168.739 s               [User: 4682.020 s, System: 1638.109 s]
>
> Summary
>   'ThinLTO' ran
>     1.02 times faster than 'ThinLTO + kCFI'
>
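
To put the summary line in absolute terms: "1.02 times faster" is the rounded ratio of the two wall-clock times, 168.739 s / 166.036 s, which works out to a difference of about 2.7 s, or roughly 1.6 %. A small helper for that arithmetic (the function name is made up; the numbers are the ones quoted above):

```c
/*
 * Relative slowdown of candidate_s versus baseline_s, in percent.
 * Both arguments are wall-clock times in seconds.
 */
static double slowdown_percent(double baseline_s, double candidate_s)
{
	return (candidate_s - baseline_s) / baseline_s * 100.0;
}
```

slowdown_percent(166.036, 168.739) gives the ThinLTO + kCFI overhead relative to plain ThinLTO for this one configuration and machine.
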

Thanks for your feedback.

-Sedat-


