[PATCH -next V6 1/7] riscv: ftrace: Fixup panic by disabling preemption

Mon Feb 6 18:31:38 PST 2023

On Mon, Jan 30, 2023 at 7:17 PM Mark Rutland <mark.rutland at arm.com> wrote:
>
> On Sat, Jan 28, 2023 at 06:00:20PM +0800, Guo Ren wrote:
> > On Thu, Jan 12, 2023 at 8:05 PM Mark Rutland <mark.rutland at arm.com> wrote:
>
> > > Ignoring things which require HW changes, you could consider doing something
> > > like what I'm doing for arm64 with DYNAMIC_FTRACE_WITH_CALL_OPS:
> > >
> > >   https://lore.kernel.org/lkml/20230109135828.879136-1-mark.rutland@arm.com/
> > The idea of DYNAMIC_FTRACE_WITH_CALL_OPS (Using data load/store +
> > indirect jump instead of auipc+jalr) is similar to Andy's solution
> > (See youtube link, last page of ppt).
>
> Sure; I was present in that room and I spoke with Andy at the time.
>
> The solutions are similar, but the important detail with
> DYNAMIC_FTRACE_WITH_CALL_OPS is that the load and indirect branch is moved into
> a common trampoline so that each call-site can be smaller. The ops pointer is
> placed *before* the function entry point and doesn't need to be skipped with a
> direct branch (which Andy's approach could also do if he aligned functions
> similarly).
>
> > But the key problem is you also expand the size of the prologue of the
> > function. 64BIT is already expensive, and we can't afford more of it. I would
> > change to seek a new atomic auipc+jalr ISA extension to solve this problem.
>
> Sure, and that's nice for *new* hardware, but I'm talking about a solution
> which works on *current* hardware.
>
> > DYNAMIC_FTRACE_WITH_CALL_OPS  would speed up ftrace_(regs)_caller (Mostly for
> > kernel debug), but it won't help DYNAMIC_FTRACE_WITH_DIRECT_CALLS. So I do
> > not so care about the ftrace_(regs)_caller performance gain.
>
> Actually, the plan is that it *will* help DYNAMIC_FTRACE_WITH_DIRECT_CALLS; we
> just didn't make all the necessary changes in one go.
>
> Florent Revest is looking at implementing that by placing the direct call
> pointer into the ops, so the common trampoline can load that directly.
>
> He has an older draft available at:
>
>   https://github.com/FlorentRevest/linux/commits/indirect-direct-calls-3
Thx for sharing :)

>
> ... and since then, having spoken to Steven, we came up with a plan to make all
> direct calls require an ops (which is the case for DIRECT_CALLS_MULTI), and
> place a trampoline pointer in the ops.
>
> That way, the common trampoline can do something like (in arm64 asm):
>
> |       LDR     <tmp>, [<ops>, #OPS_TRAMP_PTR]
> |       CBNZ    <tmp>, __call_tramp_directly
> |
> |       // ... regular regs trampoline logic here
> |
> | __call_tramp_directly:
> |
> |       // ... shuffle registers here
> |
> |       BR      <tmp>
>
> ... and I believe the same should work for riscv.
I agree; I would try next.

>
> Thanks,
> Mark.


-- 
Best Regards
 Guo Ren