[PATCH bpf-next v3 2/4] libbpf: Access first syscall argument with CO-RE direct read on arm64

Xu Kuohai xukuohai at huaweicloud.com
Wed Sep 4 19:39:07 PDT 2024


On 9/5/2024 4:06 AM, Andrii Nakryiko wrote:
> On Sat, Aug 31, 2024 at 12:57 AM Xu Kuohai <xukuohai at huaweicloud.com> wrote:
>>
>> On 8/31/2024 3:26 PM, Xu Kuohai wrote:
>>> On 8/31/2024 12:19 PM, Pu Lehui wrote:
>>>> From: Pu Lehui <pulehui at huawei.com>
>>>>
>>>> Currently PT_REGS_PARM1_SYSCALL(x) is defined as PT_REGS_PARM1_CORE_SYSCALL(x),
>>>> which introduces the overhead of BPF_CORE_READ(). Since the pt_regs being
>>>> read comes directly from the context, let's use a CO-RE direct read to
>>>> access the first system call argument.
>>>>
>>>> Suggested-by: Andrii Nakryiko <andrii at kernel.org>
>>>> Signed-off-by: Pu Lehui <pulehui at huawei.com>
>>>> ---
>>>>    tools/lib/bpf/bpf_tracing.h | 4 ++--
>>>>    1 file changed, 2 insertions(+), 2 deletions(-)
>>>>
>>>> diff --git a/tools/lib/bpf/bpf_tracing.h b/tools/lib/bpf/bpf_tracing.h
>>>> index e7d9382efeb3..051c408e6aed 100644
>>>> --- a/tools/lib/bpf/bpf_tracing.h
>>>> +++ b/tools/lib/bpf/bpf_tracing.h
>>>> @@ -222,7 +222,7 @@ struct pt_regs___s390 {
>>>>    struct pt_regs___arm64 {
>>>>        unsigned long orig_x0;
>>>> -};
>>>> +} __attribute__((preserve_access_index));
>>>>    /* arm64 provides struct user_pt_regs instead of struct pt_regs to userspace */
>>>>    #define __PT_REGS_CAST(x) ((const struct user_pt_regs *)(x))
>>>> @@ -241,7 +241,7 @@ struct pt_regs___arm64 {
>>>>    #define __PT_PARM4_SYSCALL_REG __PT_PARM4_REG
>>>>    #define __PT_PARM5_SYSCALL_REG __PT_PARM5_REG
>>>>    #define __PT_PARM6_SYSCALL_REG __PT_PARM6_REG
>>>> -#define PT_REGS_PARM1_SYSCALL(x) PT_REGS_PARM1_CORE_SYSCALL(x)
>>>> +#define PT_REGS_PARM1_SYSCALL(x) (((const struct pt_regs___arm64 *)(x))->orig_x0)
>>>>    #define PT_REGS_PARM1_CORE_SYSCALL(x) \
>>>>        BPF_CORE_READ((const struct pt_regs___arm64 *)(x), __PT_PARM1_SYSCALL_REG)
>>>
>>> Cool!
>>>
>>> Acked-by: Xu Kuohai <xukuohai at huawei.com>
>>>
>>>
>>
>> Wait, it breaks the following test:
>>
> 
> You mean, *if you change the existing test like below*, it will break,
> right? And that's expected, because arm64 has
> ARCH_HAS_SYSCALL_WRAPPER, which means syscall pt_regs are actually not
> the kprobe's ctx, so you can't directly access it. Which is why we
> have PT_REGS_PARM1_CORE_SYSCALL() variants.
> 
> See how BPF_KSYSCALL macro is implemented, there are two cases:
> ___bpf_syswrap_args(), which uses BPF_CORE_READ()-based macros to fetch
> arguments, and ___bpf_syscall_args() which uses direct ctx reads.
>

Got it, thanks for the explanation.
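
To make the indirection concrete, here is a minimal userspace model (not BPF
code) of the wrapper case. Every model_* name is hypothetical, the struct only
mirrors the leading fields of arm64's struct pt_regs, and memcpy() stands in
for bpf_probe_read_kernel(). It is a sketch of the idea, not how libbpf
implements it: the kprobe ctx holds a pointer to the real syscall pt_regs in
its first argument register, so orig_x0 has to be fetched through a probe-style
read of that inner structure.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for the leading fields of arm64 struct pt_regs. */
struct model_pt_regs {
	uint64_t regs[31];
	uint64_t sp;
	uint64_t pc;
	uint64_t pstate;
	uint64_t orig_x0;
};

/* Same offset as the rejected load "r1 = *(u64 *)(r8 +272)" in the log. */
_Static_assert(offsetof(struct model_pt_regs, orig_x0) == 272,
	       "orig_x0 is expected at offset 272");

/* Assumption: memcpy() models bpf_probe_read_kernel() in this sketch. */
static int model_probe_read(void *dst, size_t size, const void *src)
{
	assert(dst != NULL && src != NULL);
	memcpy(dst, src, size);
	return 0;
}

/*
 * Models PT_REGS_SYSCALL_REGS(ctx) with a syscall wrapper: the first
 * argument register of the kprobe ctx carries a pointer to the real
 * syscall registers. In BPF the loaded value is a plain scalar.
 */
static const struct model_pt_regs *
model_syscall_regs(const struct model_pt_regs *ctx)
{
	return (const struct model_pt_regs *)(uintptr_t)ctx->regs[0];
}

/*
 * Models the probe-read path: only the address of orig_x0 is computed
 * from the inner pointer, and the value is copied out by the read helper.
 */
static uint64_t model_parm1_syscall(const struct model_pt_regs *ctx)
{
	const struct model_pt_regs *real_regs = model_syscall_regs(ctx);
	uint64_t arg1 = 0;

	model_probe_read(&arg1, sizeof(arg1), &real_regs->orig_x0);
	return arg1;
}
```

In a real BPF program the verifier enforces this split: a direct dereference of
real_regs is rejected as a scalar access, as in the log quoted below, while a
probe read of the same address is accepted.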

> 
>> --- a/tools/testing/selftests/bpf/progs/bpf_syscall_macro.c
>> +++ b/tools/testing/selftests/bpf/progs/bpf_syscall_macro.c
>> @@ -43,7 +43,7 @@ int BPF_KPROBE(handle_sys_prctl)
>>
>>           /* test for PT_REGS_PARM */
>>
>> -       bpf_probe_read_kernel(&tmp, sizeof(tmp), &PT_REGS_PARM1_SYSCALL(real_regs));
>> +       tmp = PT_REGS_PARM1_SYSCALL(real_regs);
>>           arg1 = tmp;
>>           bpf_probe_read_kernel(&arg2, sizeof(arg2), &PT_REGS_PARM2_SYSCALL(real_regs));
>>           bpf_probe_read_kernel(&arg3, sizeof(arg3), &PT_REGS_PARM3_SYSCALL(real_regs));
>>
>> Failed with verifier rejection:
>>
>> 0: R1=ctx() R10=fp0
>> ; int BPF_KPROBE(handle_sys_prctl) @ bpf_syscall_macro.c:33
>> 0: (bf) r6 = r1                       ; R1=ctx() R6_w=ctx()
>> ; pid_t pid = bpf_get_current_pid_tgid() >> 32; @ bpf_syscall_macro.c:36
>> 1: (85) call bpf_get_current_pid_tgid#14      ; R0_w=scalar()
>> ; if (pid != filter_pid) @ bpf_syscall_macro.c:39
>> 2: (18) r1 = 0xffff800082e0e000       ; R1_w=map_value(map=bpf_sysc.rodata,ks=4,vs=4)
>> 4: (61) r1 = *(u32 *)(r1 +0)          ; R1_w=607
>> ; pid_t pid = bpf_get_current_pid_tgid() >> 32; @ bpf_syscall_macro.c:36
>> 5: (77) r0 >>= 32                     ; R0_w=scalar(smin=0,smax=umax=0xffffffff,var_off=(0x0; 0xffffffff))
>> ; if (pid != filter_pid) @ bpf_syscall_macro.c:39
>> 6: (5e) if w1 != w0 goto pc+98        ; R0_w=607 R1_w=607
>> ; real_regs = PT_REGS_SYSCALL_REGS(ctx); @ bpf_syscall_macro.c:42
>> 7: (79) r8 = *(u64 *)(r6 +0)          ; R6_w=ctx() R8_w=scalar()
>> ; tmp = PT_REGS_PARM1_SYSCALL(real_regs); @ bpf_syscall_macro.c:46
>> 8: (79) r1 = *(u64 *)(r8 +272)
>> R8 invalid mem access 'scalar'
>> processed 8 insns (limit 1000000) max_states_per_insn 0 total_states 0 peak_states 0 mark_read 0
>>



