[RFC PATCH v2 20/21] x86: Add support for CONFIG_CFI_CLANG

Sami Tolvanen samitolvanen at google.com
Mon May 16 12:39:19 PDT 2022


On Mon, May 16, 2022 at 11:30 AM Peter Zijlstra <peterz at infradead.org> wrote:
>
> On Mon, May 16, 2022 at 10:15:00AM -0700, Sami Tolvanen wrote:
> > On Mon, May 16, 2022 at 2:54 AM Peter Zijlstra <peterz at infradead.org> wrote:
> > >
> > > On Fri, May 13, 2022 at 01:21:58PM -0700, Sami Tolvanen wrote:
> > > > With CONFIG_CFI_CLANG, the compiler injects a type preamble
> > > > immediately before each function and a check to validate the target
> > > > function type before indirect calls:
> > > >
> > > >   ; type preamble
> > > >   __cfi_function:
> > > >     int3
> > > >     int3
> > > >     mov <id>, %eax
> > > >     int3
> > > >     int3
> > > >   function:
> > > >     ...
> > >
> > > When I enable CFI_CLANG and X86_KERNEL_IBT I get:
> > >
> > > 0000000000000c80 <__cfi_io_schedule_timeout>:
> > > c80:   cc                      int3
> > > c81:   cc                      int3
> > > c82:   b8 b5 b1 39 b3          mov    $0xb339b1b5,%eax
> > > c87:   cc                      int3
> > > c88:   cc                      int3
> > >
> > > 0000000000000c89 <io_schedule_timeout>:
> > > c89:   f3 0f 1e fa             endbr64
> > >
> > >
> > > That seems unfortunate. Would it be possible to get an additional
> > > compiler option to suppress the endbr for all symbols that get a __cfi_
> > > preamble?
> >
> > What's the concern with the endbr? Dropping it would currently break
> > the CFI+IBT combination on newer hardware, no?
>
> Well, yes, but also that combination isn't very interesting. See,
>
>   https://lore.kernel.org/all/20220420004241.2093-1-joao@overdrivepizza.com/T/#m5d67fb010d488b2f8eee33f1eb39d12f769e4ad2
>
> and the patch I did down-thread:
>
>   https://lkml.kernel.org/r/YoJKhHluN4n0kZDm@hirez.programming.kicks-ass.net
>
> If we have IBT, then FineIBT is a much better option than kCFI+IBT.
> Removing that superfluous endbr also shrinks the whole thing by 4 bytes.
>
> So I'm fine with the compiler generating working code for that
> combination; but please get me an option to suppress it in order to save
> those pointless bytes. All this CFI stuff is enough bloat as it is.

Sure, I'll take a look at what's the best way to accomplish this.
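
For reference, the byte layout being discussed can be sketched in C. This is
only an illustration of the preamble shown above (two int3, a 5-byte
mov <id>,%eax, two more int3, then the function), not compiler or kernel
code. The helper names kcfi_emit_preamble() and kcfi_read_hash() and the
two constants are hypothetical and exist only for this sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Layout from the thread: int3 int3 | mov $id,%eax | int3 int3 | function */
enum {
	KCFI_PREAMBLE_LEN = 9,	/* 2 + 5 + 2 bytes in front of the function */
	KCFI_HASH_OFFSET  = 6,	/* the id starts 6 bytes before the entry point */
};

/* Hypothetical helper: write the 9-byte preamble for `id` into buf[0..8]. */
static void kcfi_emit_preamble(uint8_t *buf, uint32_t id)
{
	assert(buf != NULL);

	buf[0] = 0xcc;			/* int3 */
	buf[1] = 0xcc;			/* int3 */
	buf[2] = 0xb8;			/* mov $imm32, %eax */
	buf[3] = (uint8_t)(id & 0xff);	/* imm32, little endian */
	buf[4] = (uint8_t)((id >> 8) & 0xff);
	buf[5] = (uint8_t)((id >> 16) & 0xff);
	buf[6] = (uint8_t)((id >> 24) & 0xff);
	buf[7] = 0xcc;			/* int3 */
	buf[8] = 0xcc;			/* int3 */
}

/*
 * Hypothetical helper: return the 32-bit value that `cmpl <id>, -6(%r11)`
 * compares against, given a pointer to the function entry point.
 */
static uint32_t kcfi_read_hash(const uint8_t *func_entry)
{
	const uint8_t *p = func_entry - KCFI_HASH_OFFSET;

	return (uint32_t)p[0] |
	       ((uint32_t)p[1] << 8) |
	       ((uint32_t)p[2] << 16) |
	       ((uint32_t)p[3] << 24);
}
```

With the id from the objdump output above (0xb339b1b5), the bytes after the
0xb8 opcode come out as b5 b1 39 b3, and reading at entry - 6 yields the same
id back. The endbr64 under discussion is 4 further bytes (f3 0f 1e fa) at the
function entry itself, after this preamble.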

> > > >   ; indirect call check
> > > >     cmpl    <id>, -6(%r11)
> > > >     je      .Ltmp1
> > > >     ud2
> > > >   .Ltmp1:
> > > >     call    __x86_indirect_thunk_r11
> > >
> > > The first one I try and find looks like:
> > >
> > > 26:       41 81 7b fa a6 96 9e 38         cmpl   $0x389e96a6,-0x6(%r11)
> > > 2e:       74 02                   je     32 <__traceiter_sched_kthread_stop+0x29>
> > > 30:       0f 0b                   ud2
> > > 32:       4c 89 f6                mov    %r14,%rsi
> > > 35:       e8 00 00 00 00          call   3a <__traceiter_sched_kthread_stop+0x31> 36: R_X86_64_PLT32      __x86_indirect_thunk_r11-0x4
> > >
> > > This must not be. If I'm to rewrite that lot to:
> > >
> > >   movl  $\hash, %r10d
> > >   sub   $9, %r11
> > >   call  *%r11
> > >   .nop  4
> > >
> > > Then there must not be a spurious instruction in between the ud2 and the
> > > indirect call/retpoline thing.
> >
> > With the current compiler patch, LLVM sets up function arguments after
> > the CFI check. If it's a problem, we can look into changing that.
>
> Yes, please fix that. Again see that same patch for why this is a
> problem. Objtool can trivially find retpoline calls, but finding this
> kCFI gadget is going to be hard work. If you ensure they're
> unconditionally stuck together, then the problem goes away: find one,
> and you have found the other.

You can use .kcfi_traps to locate the check right now, but I agree,
it's not quite ideal.
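
To make the adjacency requirement concrete, here is a rough C sketch of the
property being asked for. It is not objtool code; the function name
kcfi_check_adjacent_to_call() is hypothetical, and it only recognises the one
encoding that appears in the disassembly quoted above (cmpl with a -6(%r11)
operand, a 2-byte je over the ud2, then a direct call with opcode e8).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: return true if `code` starts with
 *
 *   41 81 7b fa <imm32>   cmpl $imm32, -0x6(%r11)
 *   74 02                 je   +2
 *   0f 0b                 ud2
 *
 * and the instruction right after the ud2 is a direct call (e8 rel32),
 * i.e. nothing was scheduled between the check and the call.
 */
static bool kcfi_check_adjacent_to_call(const uint8_t *code, size_t len)
{
	assert(code != NULL);

	/* 8 (cmpl) + 2 (je) + 2 (ud2) + at least the next opcode byte */
	if (len < 13)
		return false;

	if (code[0] != 0x41 || code[1] != 0x81 ||
	    code[2] != 0x7b || code[3] != 0xfa)
		return false;		/* not cmpl $imm32, -0x6(%r11) */

	if (code[8] != 0x74 || code[9] != 0x02)
		return false;		/* not je over the ud2 */

	if (code[10] != 0x0f || code[11] != 0x0b)
		return false;		/* not ud2 */

	return code[12] == 0xe8;	/* call must follow immediately */
}
```

Fed the bytes from the __traceiter_sched_kthread_stop example, this returns
false because of the mov %r14,%rsi (4c 89 f6) between the ud2 and the call;
with that mov moved ahead of the cmpl, the same check would return true.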

Sami


