[RFC PATCH v2 20/21] x86: Add support for CONFIG_CFI_CLANG
Sami Tolvanen
samitolvanen at google.com
Mon May 16 12:39:19 PDT 2022
On Mon, May 16, 2022 at 11:30 AM Peter Zijlstra <peterz at infradead.org> wrote:
>
> On Mon, May 16, 2022 at 10:15:00AM -0700, Sami Tolvanen wrote:
> > On Mon, May 16, 2022 at 2:54 AM Peter Zijlstra <peterz at infradead.org> wrote:
> > >
> > > On Fri, May 13, 2022 at 01:21:58PM -0700, Sami Tolvanen wrote:
> > > > With CONFIG_CFI_CLANG, the compiler injects a type preamble
> > > > immediately before each function and a check to validate the target
> > > > function type before indirect calls:
> > > >
> > > > ; type preamble
> > > > __cfi_function:
> > > > int3
> > > > int3
> > > > mov <id>, %eax
> > > > int3
> > > > int3
> > > > function:
> > > > ...
> > >
> > > When I enable CFI_CLANG and X86_KERNEL_IBT I get:
> > >
> > > 0000000000000c80 <__cfi_io_schedule_timeout>:
> > > c80: cc int3
> > > c81: cc int3
> > > c82: b8 b5 b1 39 b3 mov $0xb339b1b5,%eax
> > > c87: cc int3
> > > c88: cc int3
> > >
> > > 0000000000000c89 <io_schedule_timeout>:
> > > c89: f3 0f 1e fa endbr64
> > >
> > >
> > > That seems unfortunate. Would it be possible to get an additional
> > > compiler option to suppress the endbr for all symbols that get a __cfi_
> > > preamble?
> >
> > What's the concern with the endbr? Dropping it would currently break
> > the CFI+IBT combination on newer hardware, no?
>
> Well, yes, but also that combination isn't very interesting. See,
>
> https://lore.kernel.org/all/20220420004241.2093-1-joao@overdrivepizza.com/T/#m5d67fb010d488b2f8eee33f1eb39d12f769e4ad2
>
> and the patch I did down-thread:
>
> https://lkml.kernel.org/r/YoJKhHluN4n0kZDm@hirez.programming.kicks-ass.net
>
> If we have IBT, then FineIBT is a much better option than kCFI+IBT.
> Removing that superfluous endbr also shrinks the whole thing by 4 bytes.
>
> So I'm fine with the compiler generating working code for that
> combination; but please get me an option to suppress it in order to save
> those pointless bytes. All this CFI stuff is enough bloat as it is.
Sure, I'll take a look at the best way to accomplish this.
> > > > ; indirect call check
> > > > cmpl <id>, -6(%r11)
> > > > je .Ltmp1
> > > > ud2
> > > > .Ltmp1:
> > > > call __x86_indirect_thunk_r11
> > >
> > > The first one I try and find looks like:
> > >
> > > 26: 41 81 7b fa a6 96 9e 38 cmpl $0x389e96a6,-0x6(%r11)
> > > 2e: 74 02 je 32 <__traceiter_sched_kthread_stop+0x29>
> > > 30: 0f 0b ud2
> > > 32: 4c 89 f6 mov %r14,%rsi
> > > 35: e8 00 00 00 00 call 3a <__traceiter_sched_kthread_stop+0x31> 36: R_X86_64_PLT32 __x86_indirect_thunk_r11-0x4
> > >
> > > This must not be. If I'm to rewrite that lot to:
> > >
> > > movl $\hash, %r10d
> > > sub $9, %r11
> > > call *%r11
> > > .nop 4
> > >
> > > Then there must not be spurious instruction in between the ud2 and the
> > > indirect call/retpoline thing.
> >
> > With the current compiler patch, LLVM sets up function arguments after
> > the CFI check. If it's a problem, we can look into changing that.
>
> Yes, please fix that. Again see that same patch for why this is a
> problem. Objtool can trivially find retpoline calls, but finding this
> kCFI gadget is going to be hard work. If you ensure they're
> unconditionally stuck together, then the problem goes away: find one,
> find the other.
You can use .kcfi_traps to locate the check right now, but I agree,
it's not quite ideal.
Sami