[PATCH -next V7 1/7] riscv: ftrace: Fixup panic by disabling preemption

Guo Ren guoren at kernel.org
Sat Jan 28 01:37:46 PST 2023


On Thu, Jan 12, 2023 at 8:16 PM Mark Rutland <mark.rutland at arm.com> wrote:
>
> Hi Guo,
>
> On Thu, Jan 12, 2023 at 04:05:57AM -0500, guoren at kernel.org wrote:
> > From: Andy Chiu <andy.chiu at sifive.com>
> >
> > On RISC-V, we must use an AUIPC + JALR pair to encode a jump whose
> > target lies beyond the reach of a single instruction's immediate.
> > This becomes a problem if we want to enable kernel preemption and
> > remove the patching code's dependency on stop_machine(): a task could
> > be switched out right after executing the auipc. If the ftrace target
> > is then changed before the task is switched back in, the jalr would
> > combine the updated bits 11:0 with the stale bits XLEN-1:12 and jump
> > to a bogus address.
> >
> > p: patched area performed by dynamic ftrace
> > ftrace_prologue:
> > p|      REG_S   ra, -SZREG(sp)
> > p|      auipc   ra, 0x? ------------> preempted
> >                                       ...
> >                               change ftrace function
> >                                       ...
> > p|      jalr    -?(ra) <------------- switched back
> > p|      REG_L   ra, -SZREG(sp)
> > func:
> >       xxx
> >       ret
>
> As mentioned on the last posting, I don't think this is sufficient to fix the
> issue. I've replied with more detail there:
>
>   https://lore.kernel.org/lkml/Y7%2F3hoFjS49yy52W@FVFF77S0Q05N/
>
> Even in a non-preemptible SMP kernel, if one CPU can be in the middle of
> executing the ftrace_prologue while another CPU is patching the
> ftrace_prologue, you have the exact same issue.
>
> For example, if CPU X is in the prologue and fetches the old AUIPC but the
> new JALR (because it races with CPU Y modifying those), CPU X will branch to
> the wrong address. The race window is much smaller in the absence of preemption,
> but it's still there (and will be exacerbated in virtual machines since the
> hypervisor can preempt a vCPU at any time).
>
> Note that the above is even assuming that instruction fetches are atomic, which
> I'm not sure is the case; for example arm64 has special CMODX / "Concurrent
> MODification and eXecution of instructions" rules which mean only certain
> instructions can be patched atomically.
>
> Either I'm missing something that provides mutual exclusion between the
> patching and execution of the ftrace_prologue, or this patch is not sufficient.
This patch is sufficient because riscv isn't the same as arm64: it uses
the default arch_ftrace_update_code(), which serializes patching against
all other CPUs with stop_machine(). See kernel/trace/ftrace.c:
void __weak arch_ftrace_update_code(int command)
{
        ftrace_run_stop_machine(command);
}

PS:
 Yes, stop_machine() is not ideal, and it's expensive.

>
> Thanks,
> Mark.
>
> > Fixes: afc76b8b8011 ("riscv: Using PATCHABLE_FUNCTION_ENTRY instead of MCOUNT")
> > Signed-off-by: Andy Chiu <andy.chiu at sifive.com>
> > Signed-off-by: Guo Ren <guoren at kernel.org>
> > ---
> >  arch/riscv/Kconfig | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/arch/riscv/Kconfig b/arch/riscv/Kconfig
> > index e2b656043abf..ee0d39b26794 100644
> > --- a/arch/riscv/Kconfig
> > +++ b/arch/riscv/Kconfig
> > @@ -138,7 +138,7 @@ config RISCV
> >       select HAVE_DYNAMIC_FTRACE_WITH_REGS if HAVE_DYNAMIC_FTRACE
> >       select HAVE_FTRACE_MCOUNT_RECORD if !XIP_KERNEL
> >       select HAVE_FUNCTION_GRAPH_TRACER
> > -     select HAVE_FUNCTION_TRACER if !XIP_KERNEL
> > +     select HAVE_FUNCTION_TRACER if !XIP_KERNEL && !PREEMPTION
> >
> >  config ARCH_MMAP_RND_BITS_MIN
> >       default 18 if 64BIT
> > --
> > 2.36.1
> >



-- 
Best Regards
 Guo Ren
