[PATCH] arm64: bpf: Fix UBSAN misaligned access in BPF JIT
Xu Kuohai
xukuohai at huaweicloud.com
Wed Feb 25 17:34:04 PST 2026
On 2/25/2026 7:00 PM, Fuad Tabba wrote:
> Hi Xu,
>
> On Wed, 25 Feb 2026 at 09:46, Xu Kuohai <xukuohai at huaweicloud.com> wrote:
>>
>> On 2/25/2026 5:08 PM, Fuad Tabba wrote:
>>> Hi Xu,
>>>
>>> On Wed, 25 Feb 2026 at 01:43, Xu Kuohai <xukuohai at huaweicloud.com> wrote:
>>>>
>>>> On 2/24/2026 5:31 PM, Fuad Tabba wrote:
>>>>> struct bpf_plt contains a u64 'target' field. The BPF JIT allocator
>>>>> was using an alignment of 4 bytes (sizeof(u32)), which could lead
>>>>> to the 'target' field being misaligned in the JIT buffer.
>>>>>
>>>>> Increase the alignment requirement to 8 bytes (sizeof(u64)) in
>>>>> bpf_jit_binary_pack_alloc() to guarantee proper alignment for
>>>>> struct bpf_plt.
>>>>>
>>>>> Fixes: b2ad54e1533e9 ("bpf, arm64: Implement bpf_arch_text_poke() for arm64")
>>>>> Signed-off-by: Fuad Tabba <tabba at google.com>
>>>>> ---
>>>>> arch/arm64/net/bpf_jit_comp.c | 2 +-
>>>>> 1 file changed, 1 insertion(+), 1 deletion(-)
>>>>>
>>>>> diff --git a/arch/arm64/net/bpf_jit_comp.c b/arch/arm64/net/bpf_jit_comp.c
>>>>> index 356d33c7a4ae..adf84962d579 100644
>>>>> --- a/arch/arm64/net/bpf_jit_comp.c
>>>>> +++ b/arch/arm64/net/bpf_jit_comp.c
>>>>> @@ -2119,7 +2119,7 @@ struct bpf_prog *bpf_int_jit_compile(struct bpf_prog *prog)
>>>>> extable_offset = round_up(prog_size + PLT_TARGET_SIZE, extable_align);
>>>>> image_size = extable_offset + extable_size;
>>>>> ro_header = bpf_jit_binary_pack_alloc(image_size, &ro_image_ptr,
>>>>> - sizeof(u32), &header, &image_ptr,
>>>>> + sizeof(u64), &header, &image_ptr,
>>>>> jit_fill_hole);
>>>>> if (!ro_header) {
>>>>> prog = orig_prog;
>>>>
>>>> Good catch. build_plt pads with NOP instructions to ensure a 64-bit-aligned
>>>> relative offset for the plt target, but it misses the alignment check for the
>>>> image base itself.
>>>>
>>>> Acked-by: Xu Kuohai <xukuohai at huaweicloud.com>
>>>>
>>>> nit: Add check for base alignment in build_plt, or a comment to clarify?
>>>
>>> Thanks for having a look and for the Ack.
>>>
>>> You're right that build_plt() assumes 64-bit alignment when
>>> calculating the NOP padding. However, as Will pointed out, over-aligning
>>> the entire JIT buffer just to satisfy the C standard is somewhat
>>> heavy-handed. I didn't actually run into a functional bug. The issue
>>> is that UBSAN complains because we violate the standard's alignment
>>> rules.
>>> I'll be dropping the allocator change in favor of marking struct bpf_plt
>>> as __packed.
>>>
>>
>> Interesting, I think the plt target should be 64-bit aligned to ensure
>> atomic reading on arm64. It can be updated concurrently by WRITE_ONCE
>> in the bpf_arch_text_poke function while the ldr instruction in the plt is
>> executed. If it is not aligned correctly, the ldr may read a half-old
>> half-new value, causing the plt to jump to an invalid destination.
>
> You're right. I missed that target is concurrently updated via
> bpf_arch_text_poke() and read by ldr. If target crosses an 8-byte
> boundary, we lose the single-copy atomicity guarantee, risking a torn
> read. So I guess that this isn't just UBSAN, it could cause real
> issues...
>
>> To avoid over-aligning the entire buffer, how about fixing the padding
>> method in build_plt to just make the plt target aligned correctly?
>
> I'm not sure about this. If my reading of the code is correct, during
> the first JIT pass, ctx->image is NULL. The current padding logic in
> build_plt() looks like this:
>
> /* make sure target is 64-bit aligned */
> if ((ctx->idx + PLT_TARGET_OFFSET / AARCH64_INSN_SIZE) % 2)
> emit(A64_NOP, ctx);
>
> This forces the relative offset of the PLT to be a multiple of 8
> bytes. Therefore, it assumes that the base pointer (ctx->image) is
> also 8-byte aligned. If the allocator gives us a base pointer that is
> only 4-byte aligned, target will end up misaligned.
> If we try to make the padding dynamic based on the actual address of
> ctx->image, pass 1 (where ctx->image is NULL) and pass 2 (where
> ctx->image is allocated) might disagree on the number of NOPs
> required. This would cause ctx->idx to diverge between passes,
> breaking the size calculations and offset tables.
>
Right, I missed that it would cause ctx->idx to diverge. Thanks for the
explanation!
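
To make the divergence concrete, here is a minimal standalone sketch (userspace C, not kernel code). The helper names are hypothetical, and the target offset is passed in as a parameter (in instructions) rather than taken from PLT_TARGET_OFFSET. It compares padding chosen from the relative offset, as build_plt() does today, with padding chosen from the absolute address:

```c
#include <assert.h>
#include <stdint.h>

#define INSN_SIZE 4u /* AArch64 instructions are 4 bytes wide */

/*
 * Hypothetical model of the current check: NOPs (0 or 1) needed so that
 * the target's offset from the image start is a multiple of 8 bytes.
 */
unsigned int nops_for_relative_offset(unsigned int idx,
				      unsigned int target_off_insns)
{
	return (idx + target_off_insns) % 2;
}

/*
 * Hypothetical "dynamic" variant: NOPs (0 or 1) needed so that the
 * target's absolute address is a multiple of 8 bytes. The result
 * depends on the base address, which is the problem.
 */
unsigned int nops_for_absolute_address(uintptr_t base, unsigned int idx,
				       unsigned int target_off_insns)
{
	uintptr_t addr;

	assert(base % INSN_SIZE == 0); /* image is at least insn-aligned */
	addr = base + (uintptr_t)(idx + target_off_insns) * INSN_SIZE;
	return (unsigned int)((addr % 8) / INSN_SIZE);
}

/* Absolute address of the target once 'nops' padding NOPs are emitted. */
uintptr_t target_address(uintptr_t base, unsigned int idx, unsigned int nops,
			 unsigned int target_off_insns)
{
	return base + (uintptr_t)(idx + nops + target_off_insns) * INSN_SIZE;
}
```

With idx = 5 and a target offset of 2 instructions, the dynamic variant asks for one NOP when the base is 0 (pass 1, image is NULL) but zero NOPs when the base is 0x1004, so the two passes would emit different instruction counts. With the relative check and a base of 0x1004, the target lands at an address that is 4 modulo 8, which is the misalignment the v1 patch avoids by aligning the base.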
> Given that enforcing 8-byte alignment in bpf_jit_binary_pack_alloc
> wastes a maximum of 4 bytes per BPF program, I think my original v1
> patch was actually the safest and cleanest way to fix both the UBSAN
> warning and the tearing risk.
>
> What do you think? If you agree, I will abandon v2, and resubmit the
> original one as v3 with an updated commit message detailing your
> observation regarding the atomic read/write requirement.
>
Makes sense to me.
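
For the record, the "at most 4 bytes" figure follows from simple round-up arithmetic. The sketch below is only that arithmetic under the assumption that the start address is already 4-byte aligned; it is not how bpf_jit_binary_pack_alloc() places the image, and both helper names are hypothetical:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Round addr up to the next multiple of align (a power of two). */
uintptr_t align_up(uintptr_t addr, size_t align)
{
	assert(align != 0 && (align & (align - 1)) == 0);
	return (addr + align - 1) & ~((uintptr_t)align - 1);
}

/* Bytes skipped when moving addr up to the requested alignment. */
size_t alignment_waste(uintptr_t addr, size_t align)
{
	return (size_t)(align_up(addr, align) - addr);
}
```

For a 4-byte-aligned start, alignment_waste(addr, 8) is either 0 (e.g. 0x1000) or 4 (e.g. 0x1004), never more.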
> Thanks,
> /fuad
>
>>
>>> Thanks again,
>>> /fuad
>>