[PATCH] arm64: asmlinkage: Enable use of BTI_C macro in SYM_CODE

Mark Rutland mark.rutland at arm.com
Mon Dec 6 10:38:46 PST 2021


On Mon, Dec 06, 2021 at 06:05:44PM +0000, Mark Brown wrote:
> On Mon, Dec 06, 2021 at 04:30:41PM +0000, Mark Rutland wrote:
> 
> > Looking around, there are other places that open-code `hint 34`, e.g.
> > arch/arm64/crypto/aes-modes.S. Those are unconditional, so we should probably
> > figure out whether we want those to be conditional (or if we're happy to make
> > the other cases similarly unconditional).
> 
> It seems to *just* be aes-modes.S AFAICT and that does rely on having
> the BTIs actually emitted (well, and kselftest but that can't use any
> kernel internal helpers so it's not relevant here).
> 
> > I'd argue we should probably place BTIs in assembly unconditionally, on the
> > assumption that they shouldn't have an measureable performance impact in
> > practice (as we're already assuming that when CONFIG_ARM64_BTI_KERNEL is
> > selected anyhow). Thoughts?
> 
> I'm not sure what you mean here?  We do already add BTIs unconditionally
> to SYM_FUNC when BTI is enabled for the kernel, it's only SYM_CODE where
> we don't. 

I mean that regardless of whether BTI is enabled for the kernel, all
SYM_FUNC_*() code should get an actual BTI instruction, and if we have a macro
(be it `BTI_C`, `bti`, or otherwise), that should output an actual BTI
instruction. Where explicitly use the macro in SYM_CODE_*() code, that would
have an actual BTI, but we wouldn't insert it automatically into SYM_CODE_*().

> If you mean that the macro should unconditionally emit BTI when the
> macro is used then the main reason I didn't do that was for consistency
> with C code - that will only have BTIs added if we're building with BTI
> enabled.  There is at least one implementation out there which implements
> unknown HINTs with a performance overhead compared to NOPs so that is
> one of the reasons why the kernel might be built with BTI disabled.

I appreciate that rationale; I'm saying we should re-evaluate that.

I expect distros will (eventually) enable BTI for usersapce (where perf impact
likely matters more), defconfig has BTI enabled (so people will be using that
by default), and the general expectation is that HINT space instructions should
be fast when they're NOPs. I suspect that even if a particular CPU has
expensive HINTs, since asm functions are called more rarely than C functions,
it's likely in the noise anyway.

If this does actually matter in practice, then I'm happy to make it
conditional, I just think we should err on the side of simplicity otherwise.
Especially as we already have on case where we add the instructions
unconditionally, in what is arguably fast-path code!

> > Regardless, I reckon we should probably define a `bti` macro centrally in
> > arch/arm64/include/asm/assembler.h, something like:
> 
> > That seems to do the right thing:
> 
> Yes, even seems to do the right thing without complaint if the assembler
> is configured v8.5 and so understands v8.5.  My main worry would be the
> potential confusion caused when people see what looks like it should be
> a v8.5 instruction in code which is supposed to build configured for
> v8.0, it looks wrong to see something that looks like a more modern
> instruction in code that should be buildable with a toolchain that
> doesn't understand it.

Sure, but that's already the case for SB and ESB, so I'm not sure that matters.

> > Can we simplify the ifdeffery so that we only conditionally define BTI_C, and
> > unconditionally define the sites using it? That way if there's a problem with
> > any of the users that will consistently show up in testing, regardless of
> > whether CONFIG_ARM64_BTI_KERNEL is selected.
> > 
> > e.g. we can make this:
> > 
> > | #ifdef CONFIG_ARM64_BTI_KERNEL
> > | #define BTI_C	hint #34 ;
> > | #else
> > | #define BTI_C	
> > | #endif
> 
> Should work.

Cool.

Thanks,
Mark.



More information about the linux-arm-kernel mailing list