[RFC PATCH v2 20/21] x86: Add support for CONFIG_CFI_CLANG

Peter Zijlstra peterz at infradead.org
Mon May 16 04:45:17 PDT 2022


On Mon, May 16, 2022 at 11:54:33AM +0200, Peter Zijlstra wrote:
> On Fri, May 13, 2022 at 01:21:58PM -0700, Sami Tolvanen wrote:
> > With CONFIG_CFI_CLANG, the compiler injects a type preamble
> > immediately before each function and a check to validate the target
> > function type before indirect calls:
> > 
> >   ; type preamble
> >   __cfi_function:
> >     int3
> >     int3
> >     mov <id>, %eax
> >     int3
> >     int3
> >   function:
> >     ...
> 
> When I enable CFI_CLANG and X86_KERNEL_IBT I get:
> 
> 0000000000000c80 <__cfi_io_schedule_timeout>:
> c80:   cc                      int3
> c81:   cc                      int3
> c82:   b8 b5 b1 39 b3          mov    $0xb339b1b5,%eax
> c87:   cc                      int3
> c88:   cc                      int3
> 
> 0000000000000c89 <io_schedule_timeout>:
> c89:   f3 0f 1e fa             endbr64
> 
> 
> That seems unfortunate. Would it be possible to get an additional
> compiler option to suppress the endbr for all symbols that get a __cfi_
> preamble?
> 
> Also, perhaps s/CFI_CLANG/KERNEL_CFI/ or somesuch, so that GCC might
> also implement this same scheme (in time)?
> 
> >   ; indirect call check
> >     cmpl    <id>, -6(%r11)
> >     je      .Ltmp1
> >     ud2
> >   .Ltmp1:
> >     call    __x86_indirect_thunk_r11
> 
> The first one I try and find looks like:
> 
> 26:       41 81 7b fa a6 96 9e 38         cmpl   $0x389e96a6,-0x6(%r11)
> 2e:       74 02                   je     32 <__traceiter_sched_kthread_stop+0x29>
> 30:       0f 0b                   ud2
> 32:       4c 89 f6                mov    %r14,%rsi
> 35:       e8 00 00 00 00          call   3a <__traceiter_sched_kthread_stop+0x31> 36: R_X86_64_PLT32      __x86_indirect_thunk_r11-0x4
> 
> This must not be. If I'm to rewrite that lot to:
> 
>   movl	$\hash, %r10d
>   sub	$9, %r11
>   call	*%r11
>   .nops 4
> 
> Then there must not be any spurious instructions between the ud2 and the
> indirect call/retpoline thing.

Hmmm... what if I instead replace only the check itself with:

   movl	$\hash, %r10d
   sub	$9, %r11
   .nops 2

That would work, and it has the added benefit of coexisting nicely with
the current retpoline patching.
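
To make the size argument concrete: the compiler's check is 12 bytes
(cmpl 8 + je 2 + ud2 2), and movl (6) + sub (4) + a 2-byte nop is also
12 bytes, so the rewrite fits in place and the call that follows is left
alone. Below is a rough userspace sketch of that in-place rewrite. The
helper name rewrite_cfi_check() is hypothetical, the encodings are taken
from the disassembly quoted above, and none of this is the actual kernel
patching code.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define CFI_CHECK_LEN 12	/* cmpl (8) + je (2) + ud2 (2) */

/* movl (6) + sub (4) + nop (2) must exactly cover the original check. */
static_assert(6 + 4 + 2 == CFI_CHECK_LEN, "replacement must be 12 bytes");

/*
 * Hypothetical helper: rewrite one compiler-emitted check in place.
 * Returns 0 on success, -1 if the bytes are not the expected sequence.
 */
static int rewrite_cfi_check(uint8_t *insn)
{
	/* cmpl $imm32, -6(%r11) without its immediate */
	static const uint8_t head[4] = { 0x41, 0x81, 0x7b, 0xfa };
	/* je .+2 ; ud2 */
	static const uint8_t tail[4] = { 0x74, 0x02, 0x0f, 0x0b };
	uint8_t out[CFI_CHECK_LEN];

	if (memcmp(insn, head, 4) || memcmp(insn + 8, tail, 4))
		return -1;

	out[0] = 0x41; out[1] = 0xba;		/* movl $hash, %r10d */
	memcpy(out + 2, insn + 4, 4);		/* reuse the hash immediate */
	out[6] = 0x49; out[7] = 0x83;		/* sub $9, %r11 */
	out[8] = 0xeb; out[9] = 0x09;
	out[10] = 0x66; out[11] = 0x90;		/* 2-byte nop */

	memcpy(insn, out, CFI_CHECK_LEN);
	return 0;
}
```

Because only those 12 bytes change, the 5-byte call to
__x86_indirect_thunk_r11 stays where the existing retpoline patching
expects to find it.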

The only remaining problem is how to find the check; .retpoline_sites is
fairly convenient, but if the compiler can put an arbitrary amount of
code between the check and the call, this is going to be somewhat tedious.
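
One way to picture the search, purely as an illustration: start from the
call offset recorded for a retpoline site and look backwards for the
cmpl/je/ud2 byte pattern, allowing a bounded gap for whatever the
compiler scheduled in between. The function find_cfi_check() and the
MAX_GAP bound are assumptions made up for this sketch. Matching raw
bytes backwards without decoding instructions can also hit false
positives, which is part of why this gets tedious.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define CFI_SEQ_LEN	12	/* cmpl (8) + je (2) + ud2 (2) */
#define MAX_GAP		16	/* assumed bound, illustrative only */

/*
 * Hypothetical helper: given the offset of the call instruction,
 * return the offset of the preceding cmpl, or -1 if none is found
 * within MAX_GAP bytes of intervening code.
 */
static long find_cfi_check(const uint8_t *text, size_t call_off)
{
	/* cmpl $imm32, -6(%r11) without its immediate */
	static const uint8_t head[4] = { 0x41, 0x81, 0x7b, 0xfa };
	/* je .+2 ; ud2 */
	static const uint8_t tail[4] = { 0x74, 0x02, 0x0f, 0x0b };
	size_t gap, start;

	assert(text != NULL);

	for (gap = 0; gap <= MAX_GAP; gap++) {
		if (call_off < CFI_SEQ_LEN + gap)
			break;	/* would run off the start of the buffer */

		start = call_off - CFI_SEQ_LEN - gap;
		if (!memcmp(text + start, head, 4) &&
		    !memcmp(text + start + 8, tail, 4))
			return (long)start;
	}
	return -1;
}
```

For the __traceiter_sched_kthread_stop example above, the 3-byte
"mov %r14,%rsi" sits between the ud2 and the call, so the match is found
at a gap of 3.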



