[bpf-next v6 4/5] bpf, x86: Emit ENDBR for indirect jump targets

Xu Kuohai xukuohai at huaweicloud.com
Fri Mar 6 19:56:06 PST 2026


On 3/7/2026 11:31 AM, Alexei Starovoitov wrote:
> On Fri, Mar 6, 2026 at 7:15 PM Xu Kuohai <xukuohai at huaweicloud.com> wrote:
>>
>> On 3/7/2026 9:36 AM, Eduard Zingerman wrote:
>>> On Fri, 2026-03-06 at 18:23 +0800, Xu Kuohai wrote:
>>>> From: Xu Kuohai <xukuohai at huawei.com>
>>>>
>>>> On CPUs that support CET/IBT, the indirect jump selftest triggers
>>>> a kernel panic because the indirect jump targets lack ENDBR
>>>> instructions.
>>>>
>>>> To fix it, emit an ENDBR instruction at each indirect jump target. Since
>>>> the ENDBR instruction shifts the position of original jited instructions,
>>>> fix the instruction address calculation wherever the addresses are used.
>>>>
>>>> For reference, below is a sample panic log.
>>>>
>>>>    Missing ENDBR: bpf_prog_2e5f1c71c13ac3e0_big_jump_table+0x97/0xe1
>>>>    ------------[ cut here ]------------
>>>>    kernel BUG at arch/x86/kernel/cet.c:133!
>>>>    Oops: invalid opcode: 0000 [#1] SMP NOPTI
>>>>
>>>>    ...
>>>>
>>>>     ? 0xffffffffc00fb258
>>>>     ? bpf_prog_2e5f1c71c13ac3e0_big_jump_table+0x97/0xe1
>>>>     bpf_prog_test_run_syscall+0x110/0x2f0
>>>>     ? fdget+0xba/0xe0
>>>>     __sys_bpf+0xe4b/0x2590
>>>>     ? __kmalloc_node_track_caller_noprof+0x1c7/0x680
>>>>     ? bpf_prog_test_run_syscall+0x215/0x2f0
>>>>     __x64_sys_bpf+0x21/0x30
>>>>     do_syscall_64+0x85/0x620
>>>>     ? bpf_prog_test_run_syscall+0x1e2/0x2f0
>>>>
>>>> Fixes: 493d9e0d6083 ("bpf, x86: add support for indirect jumps")
>>>> Signed-off-by: Xu Kuohai <xukuohai at huawei.com>
>>>> ---
>>>>    arch/x86/net/bpf_jit_comp.c | 23 +++++++++++++++--------
>>>>    1 file changed, 15 insertions(+), 8 deletions(-)
>>>>
>>>> diff --git a/arch/x86/net/bpf_jit_comp.c b/arch/x86/net/bpf_jit_comp.c
>>>> index 2c57ee446fc9..752331a64fc0 100644
>>>> --- a/arch/x86/net/bpf_jit_comp.c
>>>> +++ b/arch/x86/net/bpf_jit_comp.c
>>>> @@ -1658,8 +1658,8 @@ static int emit_spectre_bhb_barrier(u8 **pprog, u8 *ip,
>>>>       return 0;
>>>>    }
>>>>
>>>> -static int do_jit(struct bpf_prog *bpf_prog, int *addrs, u8 *image, u8 *rw_image,
>>>> -              int oldproglen, struct jit_context *ctx, bool jmp_padding)
>>>> +static int do_jit(struct bpf_verifier_env *env, struct bpf_prog *bpf_prog, int *addrs, u8 *image,
>>>> +              u8 *rw_image, int oldproglen, struct jit_context *ctx, bool jmp_padding)
>>>>    {
>>>>       bool tail_call_reachable = bpf_prog->aux->tail_call_reachable;
>>>>       struct bpf_insn *insn = bpf_prog->insnsi;
>>>> @@ -1743,6 +1743,11 @@ static int do_jit(struct bpf_prog *bpf_prog, int *addrs, u8 *image, u8 *rw_image
>>>>                               dst_reg = X86_REG_R9;
>>>>               }
>>>>
>>>> +#ifdef CONFIG_X86_KERNEL_IBT
>>>> +            if (bpf_insn_is_indirect_target(env, bpf_prog, i - 1))
>>>> +                    EMIT_ENDBR();
>>>> +#endif
>>>> +
>>>>               switch (insn->code) {
>>>>                       /* ALU */
>>>>               case BPF_ALU | BPF_ADD | BPF_X:
>>>> @@ -2449,7 +2454,7 @@ st:                    if (is_imm8(insn->off))
>>>>
>>>>                       /* call */
>>>>               case BPF_JMP | BPF_CALL: {
>>>> -                    u8 *ip = image + addrs[i - 1];
>>>> +                    u8 *ip = image + addrs[i - 1] + (prog - temp);
>>>
>>> Sorry, meant to reply to v5 but got distracted.
>>> It seems tedious/error prone to have this addend at each location,
>>> would it be possible to move the 'ip' variable calculation outside
>>> of the switch? It appears that at each point there would be no
>>> EMIT invocations between 'ip' computation and usage.
>>>
>>
>> Besides the changes shown in this patch, there is another line in the
>> file that computes an address as 'image + addrs[i - 1] + (prog - temp)'.
>>
>> It is at the call to emit_return() in the 'BPF_JMP | BPF_EXIT' case.
>> There, EMIT*() invocations do precede the address computation,
>> so an address pre-computed before the switch statement would be stale
>> in this case.
>>
>> To fix this, how about introducing a macro for the address computation,
>> as the following diff (based on this patch) shows:
>>
>> --- a/arch/x86/net/bpf_jit_comp.c
>> +++ b/arch/x86/net/bpf_jit_comp.c
>> @@ -1606,6 +1606,8 @@ static void emit_priv_frame_ptr(u8 **pprog, void __percpu *priv_frame_ptr)
>>
>>    #define INSN_SZ_DIFF (((addrs[i] - addrs[i - 1]) - (prog - temp)))
>>
>> +#define CURR_IP      (image + addrs[i - 1] + (prog - temp))
>> +
> 
> No. Don't obfuscate it with macro.
> I don't like INSN_SZ_DIFF either.
> 
> 
OK, will use Eduard's approach as is.
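
For reference, a minimal standalone sketch of why a single 'ip' computed
before the switch is only valid when nothing is emitted between the
computation and the use. The struct and the sketch_*() helpers are
hypothetical names made up for this illustration, and the base address and
offsets are arbitrary; none of this is the actual bpf_jit_comp.c code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for the per-instruction JIT state in do_jit(). */
struct insn_sketch {
	uintptr_t image;   /* made-up base address of the final image */
	int insn_start;    /* plays the role of addrs[i - 1] */
	uint8_t temp[64];  /* scratch buffer for the current instruction */
	uint8_t *prog;     /* next free byte in temp */
};

static void sketch_begin(struct insn_sketch *s, uintptr_t image, int insn_start)
{
	s->image = image;
	s->insn_start = insn_start;
	s->prog = s->temp;
}

/* Append bytes and advance prog, loosely mimicking the EMIT*() macros. */
static void sketch_emit(struct insn_sketch *s, const uint8_t *bytes, size_t len)
{
	assert((size_t)(s->prog - s->temp) + len <= sizeof(s->temp));
	memcpy(s->prog, bytes, len);
	s->prog += len;
}

/* Address in the final image of the next byte to be emitted. */
static uintptr_t sketch_ip(const struct insn_sketch *s)
{
	return s->image + (uintptr_t)s->insn_start +
	       (uintptr_t)(s->prog - s->temp);
}

/*
 * Returns how many bytes an 'ip' computed before the switch lags behind
 * the real address when a case body emits 'extra' bytes before using it.
 */
static size_t sketch_ip_lag(size_t extra)
{
	static const uint8_t endbr64[4] = { 0xf3, 0x0f, 0x1e, 0xfa };
	static const uint8_t filler[8] = { 0x90, 0x90, 0x90, 0x90,
					   0x90, 0x90, 0x90, 0x90 };
	struct insn_sketch s;
	uintptr_t ip_before_switch;

	assert(extra <= sizeof(filler));
	sketch_begin(&s, 0x1000, 0x20);
	sketch_emit(&s, endbr64, sizeof(endbr64)); /* indirect jump target */
	ip_before_switch = sketch_ip(&s);          /* computed once, after ENDBR */
	sketch_emit(&s, filler, extra);            /* case body emits first */
	return (size_t)(sketch_ip(&s) - ip_before_switch);
}
```

A lag of zero corresponds to the situation Eduard describes, where no EMIT
happens between the 'ip' computation and its use. A nonzero lag corresponds
to the emit_return() call mentioned above, where the address has to be taken
at the point of use.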



