[PATCH -next V7 0/7] riscv: Optimize function trace

David Laight David.Laight at ACULAB.COM
Wed Feb 8 14:29:41 PST 2023


> >   # Note: aligned to 8 bytes
> >   addr-08               // Literal (first 32-bits)      // patched to ops ptr
> >   addr-04               // Literal (last 32-bits)       // patched to ops ptr
> >   addr+00       func:   mv      t0, ra
> We needn't "mv t0, ra" here because our "jalr" could work with t0 and
> won't affect ra. Let's do it in the trampoline code, and then we can
> save another word here.
> >   addr+04               auipc   t1, ftrace_caller
> >   addr+08               jalr    ftrace_caller(t1)

Is that some kind of 'load high' and 'add offset' pair, with the
auipc supplying the upper bits and the jalr adding the low offset?
I guess 64-bit kernels guarantee that all module code is placed
within +-2G of the main kernel?
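
For reference, a minimal sketch of how a pc-relative offset splits
across an auipc/jalr pair, which is where the roughly +-2G reach comes
from. The names split_pcrel and struct pcrel_pair are made up for
illustration and are not kernel APIs; the sketch only models the
immediate arithmetic (20-bit upper part in 4 KiB units, sign-extended
12-bit low part), not instruction encoding.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper type: the two immediates of an auipc/jalr pair. */
struct pcrel_pair {
	int32_t hi20;	/* auipc immediate, in units of 4 KiB */
	int32_t lo12;	/* jalr immediate, -2048..2047 */
};

/*
 * Split a pc-relative byte offset into hi20/lo12.
 * Returns false if the target is out of reach of a single pair.
 */
static bool split_pcrel(int64_t offset, struct pcrel_pair *out)
{
	assert(out != NULL);

	/* Sign-extend the low 12 bits: jalr adds a signed immediate. */
	int64_t lo = ((offset & 0xfff) ^ 0x800) - 0x800;
	/* What is left is an exact multiple of 4 KiB. */
	int64_t hi = (offset - lo) / 4096;

	/* The auipc immediate is a signed 20-bit field. */
	if (hi < -524288 || hi > 524287)
		return false;

	out->hi20 = (int32_t)hi;
	out->lo12 = (int32_t)lo;
	return true;
}
```

Because the low part is signed, the reachable window is slightly
skewed: offsets from -2^31 - 2048 up to 2^31 - 2049 fit, anything
outside that needs a different sequence.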

> Here is the call-site:
>    # Note: aligned to 8 bytes
>    addr-08               // Literal (first 32-bits)      // patched to ops ptr
>    addr-04               // Literal (last 32-bits)       // patched to ops ptr
>    addr+00               auipc   t0, ftrace_caller
>    addr+04               jalr    ftrace_caller(t0)

Could you even do something like:
	addr-n	call ftrace-function
	addr-n+x	literals
	addr+0	nop or jmp addr-n
	addr+4	function_code
So that all the code executed when tracing is enabled
is before the label and only one 'nop' is in the body.
The called code can use the return address to find the
literals and then modify it to return to addr+4.
The code cost when trace is enabled is probably irrelevant
here - dominated by what happens later.
It probably isn't even worth aligning a 64-bit constant.
Doing two 32-bit reads probably won't be noticeable.
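
A rough sketch of what the called code would do with the return
address in the layout above. Everything here is an assumption made
for illustration: the call is taken to be two 32-bit instructions, the
literal is taken to be 8 bytes stored low word first directly after
the call, and addr+0 is taken to hold a single 32-bit nop or jmp. The
names read_literal64, decode_site and struct call_site are
hypothetical. The code area is modelled as an array of 32-bit words.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical result: the literal and where to resume execution. */
struct call_site {
	uint64_t ops;		/* 64-bit literal, e.g. an ops pointer */
	const uint32_t *resume;	/* first word of the function body (addr+4) */
};

/*
 * Read a 64-bit value that is only 4-byte aligned by doing two
 * 32-bit reads (assumes the low word is stored first).
 */
static uint64_t read_literal64(const uint32_t *lit)
{
	uint64_t lo = lit[0];
	uint64_t hi = lit[1];

	return lo | (hi << 32);
}

/*
 * 'ra' is the return address of the call placed before the label.
 * Under the assumed layout it points straight at the literal.
 */
static struct call_site decode_site(const uint32_t *ra)
{
	struct call_site s;

	assert(ra != NULL);
	s.ops = read_literal64(ra);
	/* Skip the two literal words and the nop/jmp at addr+0. */
	s.resume = ra + 2 + 1;
	return s;
}
```

The only point of the sketch is that the literal needs no 8-byte
alignment when it is read as two words, and that the adjusted return
address is a fixed distance from the original one.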

What you do want to ensure is that the initial patch is
overwriting a nop - just in case the gap isn't there.
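
Something along these lines, as a sketch only. It works on a plain
buffer of 32-bit words and refuses to write unless every word being
replaced is the canonical nop (addi x0, x0, 0, encoded as 0x00000013).
The name patch_nops is made up; real kernel code would go through the
text-patching helpers and handle compressed nops and icache flushing,
none of which is modelled here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define RISCV_INSN_NOP 0x00000013u	/* addi x0, x0, 0 */

/*
 * Overwrite 'n' instruction words at 'site' with 'new_insns', but
 * only if all of them are currently nops. Returns false and leaves
 * the site untouched otherwise.
 */
static bool patch_nops(uint32_t *site, const uint32_t *new_insns, size_t n)
{
	size_t i;

	assert(site != NULL && new_insns != NULL);

	/* Check everything first so a failure never leaves a partial patch. */
	for (i = 0; i < n; i++)
		if (site[i] != RISCV_INSN_NOP)
			return false;

	for (i = 0; i < n; i++)
		site[i] = new_insns[i];

	return true;
}
```

Checking the whole range before writing means a function compiled
without the gap is simply left alone rather than half patched.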

	David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)

