[PATCH] RISC-V: Don't check text_mutex during stop_machine

Palmer Dabbelt palmer at dabbelt.com
Sat May 22 12:32:05 PDT 2021


On Thu, 06 May 2021 06:25:50 PDT (-0700), rostedt at goodmis.org wrote:
> On Thu,  6 May 2021 00:10:41 -0700
> Palmer Dabbelt <palmer at dabbelt.com> wrote:
>
>> ---
>> In theory we should be able to avoid using stop_machine() with some
>> clever code sequences, but that's too big of a change to be considered a
>> fix.  I also can't find the text I thought was in the ISA manual about
>> the allowed behaviors for concurrent modification of the instruction
>> stream, so I might have just mis-remembered that.
>> ---
>
> I wonder if you could at least use break points, as some other archs do,
> and what x86 does.
>
> If you have this make believe machine code:
>
> 	00 00 00 00		nop
>
> And you want to turn it into a call.
>
> 	aa 12 34 56		bl ftrace_caller
>
> And if your architecture has a way to inject a break point on live code.
> Let's call this FF for the break point code.
>
> You can inject that first:
>
> 	FF 00 00 00

That would rely on this "concurrent writes to the instruction stream 
don't tear" property that I thought we'd put in the ISA but I can't 
actually find in the text of the document.  That said, I hadn't actually 
thought about replacing single bytes in the instruction stream.  In 
theory we don't have anything that guartees those won't tear, but I'm 
not sure how any reasonable implementation would go about manifesting 
those so it might be an easy retrofit to the ISA.

I ran through our breakpoint encodings and don't think those help, but 
we do have 0x00000013 as a NOP (addi x0, x0, 0) which is one byte away 
from 0x00000000 (which is defined to be an invalid instruction on all 
RISC-V systems).  There's really no difference between how a breakpoint 
traps and how an illegal instruction traps on RISC-V, so we should be 
able to use that to construct some sort of sequence.

So we can start with

    nop
    nop
    nop
    nop
    nop

then do a single-byte write to turn it into

    unimp
    nop
    nop
    nop
    nop

we can then IPI to all the harts in order to get them on the same page 
about that trap, which we can then skip over.  We'll need some way to 
differentiate this from accidental executions of unimp, but we can just 
build up a table somewhere (it wasn't immediately clear how x86 does 
this).  Then we have no ordering restrictions on converting the rest of 
the stub into what's necessary to trace a function, which should look 
the same as what we have now

    unimp
    save ra, ...(sp)
    auipc ra, ...
    jalr ra, ...
    load ra, ...(sp)

At which point we can IPI everywhere to sync the instructions stream 
again, before we turn off the trap

    nop
    save ra, ...(sp)
    auipc ra, ...
    jalr ra, ...
    load ra, ...(sp)

> sync all CPUs where now all callers will hit this and jump to the break
> point handler, which simply does:
>
> 	ip = ip + 4;
> 	return;
>
> and returns back to the instruction after this nop/call.

There'll be some slight headaches there because the length of the all-0 
unimplemented instruction depends on which extensions are supported, but 
we can detect ILEN at runtime so that shouldn't be a showstopper.

> Change the rest of the instruction.
>
> 	FF 12 34 56
>
> sync all CPUs so that they all see this new instruction, but are still
> triggering the break point.
>
> Then finally remove the break point.
>
> 	aa 12 34 56
>
> And you just switched from the nop to the call without using stop machine.

OK, ya I think that all should work.  We're still depending on this 
"single byte writes to the instruction stream don't tear" guarantee, but 
I think that's a bit pedantic.  I'll open a question on the ISA spec 
just to be sure one I can link here.

I'm probably not going to have the time to do this for a bit because I'm 
buried in this non-coherent stuff, but if someone wants to take a shot 
at it I'd be happy to take a look.

Thanks for pointing this out, this should get is away from 
stop_machine() way sooner than waiting for a multi-byte modification 
sequence would!



More information about the linux-riscv mailing list