[PATCH v3 0/7] riscv: ftrace: atmoic patching and preempt improvements

Evgenii Shatokhin e.shatokhin at yadro.com
Sun Dec 1 23:58:51 PST 2024


Hi,

On 27.11.2024 20:29, Andy Chiu wrote:
> This series makes atmoic code patching possible in riscv ftrace. A
> direct benefit of this is that we can get rid of stop_machine() when
> patching function entries. This also makes it possible to run ftrace
> with full kernel preemption. Before this series, the kernel initializes
> patchable function entries to NOP4 + NOP4. To start tracing, it updates
> entries to AUIPC + JALR while holding other cores in stop_machine.
> stop_machine() is required because it is impossible to update 2
> instructions, and be seen atomically. And preemption must have to be
> prevented, as kernel preemption allows process to be scheduled out while
> executing on one of these instruction pairs.
> 
> This series addresses the problem by initializing the first NOP4 to
> AUIPC. So, atmoic patching is possible because the kernel only has to
> update one instruction. As long as the instruction is naturally aligned,
> then it is expected to be updated atomically.
> 
> However, the address range of the ftrace trampoline is limited to +-2K
> from ftrace_caller after appplying this series. This issue is expected
> to be solved by Puranjay's CALL_OPS, where it adds 8B naturally align
> data in front of pacthable functions and can  use it to direct execution
> out to any custom trampolines.
> 
> The series is composed by three parts. The first part cleans up the
> existing issues when the kernel is compiled with clang.The second part
> modifies the ftrace code patching mechanism (2-4) as mentioned above.
> Then prepare ftrace to be able to run with kernel preemption (5,6)
> 
> An ongoing fix:
> 
> Since there is no room for marking *kernel_text_address as notrace[1] at
> source code level, there is a significant performance regression when
> using function_graph with TRACE_IRQFLAGS enabled. There can be as much as
> 8 graph handler being called in each function-entry. The current
> workaround requires us echo "*kernel_text_address" into
> set_ftrace_notrace before starting the trace. However, we observed that
> the kernel still enables the patch site in some cases even with
> *kernel_text_address properly added in the file While the root cause is
> still under investagtion, we consider that it should not be the reason
> for holding back the code patching, in order to unblock the call_ops
> part.
> 
> [1]: https://lore.kernel.org/linux-riscv/20240613093233.0b349ed0@rorschach.local.home/
> 
> Changes in v3:
> - Add a fix tag for patch 1
> - Add a data fence before sending out remote fence.i (6)
> - Link to v2: https://lore.kernel.org/all/20240628-dev-andyc-dyn-ftrace-v4-v2-0-1e5f4cb1f049@sifive.com/
> 
> Changes in v2:
> - Drop patch 1 as it is merged through fixes.
> - Drop patch 2, which converts kernel_text_address into notrace. As
>    users can prevent tracing it by configuring the tracefs.
> - Use a more generic way in kconfig to align functions.
> - Link to v1: https://lore.kernel.org/r/20240613-dev-andyc-dyn-ftrace-v4-v1-0-1a538e12c01e@sifive.com
> 
> 
> Andy Chiu (7):
>    riscv: ftrace: support fastcc in Clang for WITH_ARGS
>    riscv: ftrace: align patchable functions to 4 Byte boundary
>    riscv: ftrace: prepare ftrace for atomic code patching
>    riscv: ftrace: do not use stop_machine to update code
>    riscv: vector: Support calling schedule() for preemptible Vector
>    riscv: add a data fence for CMODX in the kernel mode
>    riscv: ftrace: support PREEMPT
> 
>   arch/riscv/Kconfig                 |   4 +-
>   arch/riscv/include/asm/ftrace.h    |  11 +++
>   arch/riscv/include/asm/processor.h |   5 ++
>   arch/riscv/include/asm/vector.h    |  22 ++++-
>   arch/riscv/kernel/asm-offsets.c    |   7 ++
>   arch/riscv/kernel/ftrace.c         | 133 ++++++++++++-----------------
>   arch/riscv/kernel/mcount-dyn.S     |  25 ++++--
>   arch/riscv/mm/cacheflush.c         |  15 +++-
>   8 files changed, 135 insertions(+), 87 deletions(-)
> ---
> base-commit: 0eb512779d642b21ced83778287a0f7a3ca8f2a1
> --
> 2.39.3 (Apple Git-145)

I have tested this series in a QEMU VM (-machine virt) with the 
preemptible kernels, CONFIG_PREEMPT=y.

No issues have been revealed so far.

One of the kernels was built by GCC 13.2.0 (with the patch for minimum 
alignment added on top of it), the other - with LLVM 17.0.6.

In both cases, the basic boottime tests for Ftrace passed.

Switching tracers between nop, function, function_graph and blk in a 
loop in parallel with stress-ng and with network load for several hours 
did not reveal any problems either. Kernel crashes happened in this 
scenario a year ago, but now it runs clean, good!

Function redirection via Ftrace seems to work OK too.

The size of .text section increased slightly after this series, by 0.35% 
- 0.38%, probably because of function alignment:
* 12075 KB => 12121 KB (GCC)
* 12167 KB => 12209 KB (LLVM/clang)

Not sure, how to test the vector-related part though, "[PATCH v3 5/7] 
riscv: vector: Support calling schedule() for preemptible Vector"

For all other patches in the series:
Tested-by: Evgenii Shatokhin <e.shatokhin at yadro.com>

Regards,
Evgenii



More information about the linux-riscv mailing list