[GIT PULL] RISC-V updates for v7.0
Deepak Gupta
debug at rivosinc.com
Thu Feb 26 13:04:19 PST 2026
Hi Peter,
Responses inline.
On Thu, Feb 26, 2026 at 02:23:42PM +0100, Peter Zijlstra wrote:
>On Wed, Feb 18, 2026 at 05:57:45PM -0800, Deepak Gupta wrote:
>
>> x86 doesn't have any equivalent BTI bit in PTEs to mark code pages. IIRC, it
>> does have mechanism where a bitmap has to be prepared and each entry in bitmap
>> encodes whether a page is legacy code page (without `endbr64`) or a modern code
>> page (with `endbr64`). And CPU will consult this bitmap to suppress the fault.
>
>So; all of this is only ever relevant for programs that are mixing CFI
>and !CFI code. If a program has no CFI, all good. If a program is all
>CFI enabled, also all good.
>
>If it starts mixing things, then you get to be 'creative'.
>
>Now the thing is, if you start to do that you need to deal with both
>forward CFI (BTI) and backward CFI (shadow-stack) #CF exceptions. That
>bitmap, that can only deal with BTI, but doesn't help with shadow
>stack, so its useless.
>
>My proposal was to ignore that whole bitmap; that's dead hardware, never
>used. Instead use a software PTE bit, like ARM has, and simply eat the
>#CF look at PTE and figure out what to do.
IIRC, arm has a hardware PTE bit that marks a page as guarded. That bit can be
cached in the ITLB as part of the virt addr translation during instruction
fetch. So whenever an indir_call --> target happens and the target translation
is already in the ITLB, the CPU already knows whether to suppress the fault,
without going to the kernel.
In the x86 case, using a software PTE bit would be different. There will always
be a fault, and the kernel won't be able to decide what to do. It'll need to
delegate that decision to some authority. That authority can be a signal
handler in userspace, which may need a bitmap/auxiliary data structure of some
sort to decide whether the target address is a valid target or should not be
taken.
So the decision point is either
- do a software bitmap or
- use the hardware bitmap (legacy interworking bitmap)
(both will be slow).
OR
Just don't allow/support enabling CFI in that configuration, and put the onus
on the workload owner to do the work to enable the feature.
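To make the software bitmap option concrete, here is a minimal sketch of the
lookup a userspace handler might do. Everything in it is an assumption for
illustration: the struct, the function names, the 4 KiB page size and the
one-bit-per-page layout are hypothetical and are not an existing kernel or
libc interface.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: 4 KiB pages. */
#define SKETCH_PAGE_SHIFT 12

/*
 * Hypothetical per-process table. Bit N set means page N of the covered
 * range holds legacy code (no landing pads), so a missing-landing-pad
 * fault on a branch into that page may be suppressed.
 */
struct legacy_bitmap {
	uintptr_t base;    /* first address covered by the bitmap */
	size_t    npages;  /* number of pages covered */
	uint8_t  *bits;    /* at least (npages + 7) / 8 bytes */
};

/* Mark the page containing addr as legacy code. Out-of-range is ignored. */
void legacy_bitmap_mark(struct legacy_bitmap *bm, uintptr_t addr)
{
	if (addr < bm->base)
		return;
	size_t page = (addr - bm->base) >> SKETCH_PAGE_SHIFT;
	if (page >= bm->npages)
		return;
	bm->bits[page / 8] |= (uint8_t)(1u << (page % 8));
}

/* Return true if the branch target lies in a page marked as legacy. */
bool legacy_bitmap_test(const struct legacy_bitmap *bm, uintptr_t target)
{
	if (target < bm->base)
		return false;
	size_t page = (target - bm->base) >> SKETCH_PAGE_SHIFT;
	if (page >= bm->npages)
		return false;
	return (bm->bits[page / 8] >> (page % 8)) & 1u;
}
```

The lookup itself is cheap. The slow part is the fault and the round trip
through the handler on every indirect branch into a legacy page.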
Sidenote: I wish we had been able to convince a certain someone in Redmond to
give a sw bit back; this all would have been nicer. Given there wasn't a lot of
traction from open source for this feature, it was mostly a Redmond-driven
feature.
>
>Yes, this is 'slow', but my claim is that this doesn't matter. There are
>2 ways out of this slow-ness:
>
> - fully disable CFI for your program (probably not the thing you want,
> but a quick fix, and not really less secure than partial CFI anyway).
>
> - fully enable CFI for your program (might be a bit of work).
>
>The whole mixed thing is a transition state where userspace doesn't have
>its ducks in a row. It will go away.
I spent 8 years at Intel defining features to kill a class of low-level
exploits, and then the next 6 years in places where software is deployed on
these CPUs.
I am a security engineer and would have loved to get these features enabled.
But in all honesty, I have yet to see anyone at these places (hyperscalers)
willing to give up an ounce of perf budget (1-2% demands discussion and strong
justification) for enabling just the shadow stack feature.
So my advice would be not to care about an enabling path where there is a perf
hit. Keep it simple:
- Enable when all binaries have feature awareness.
- Disable when there is one binary with no feature awareness.
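As a rough sketch of that all-or-nothing rule (the struct and function names
are hypothetical, and how feature awareness is actually detected per binary is
left out):

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical record for one binary or shared object in the process. */
struct loaded_object {
	const char *name;
	bool        cfi_aware;  /* assumption: already derived from the binary */
};

/*
 * Enable CFI only if every object is feature aware. A single unaware
 * object disables the feature for the whole process.
 */
bool cfi_should_enable(const struct loaded_object *objs, size_t n)
{
	for (size_t i = 0; i < n; i++) {
		if (!objs[i].cfi_aware)
			return false;
	}
	return true;
}
```

With this there is no mixed state to handle at fault time, so no bitmap and no
handler round trip.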
>
>