[PATCH bpf-next v2] bpf, arm64: Optimize BPF store/load using str/ldr with immediate offset
Xu Kuohai
xukuohai at huawei.com
Mon Mar 14 18:55:03 PDT 2022
On 2022/3/15 6:11, Daniel Borkmann wrote:
> On 3/14/22 9:48 AM, Xu Kuohai wrote:
>> The current BPF store/load instruction is translated by the JIT into two
>> instructions. The first instruction moves the immediate offset into a
>> temporary register. The second instruction uses this temporary register
>> to do the real store/load.
>>
>> In fact, arm64 supports addressing with immediate offsets. So this patch
>> introduces an optimization that uses the arm64 str/ldr instructions with
>> an immediate offset when the offset fits.
>>
>> Example of generated instructions for r2 = *(u64 *)(r1 + 0):
>>
>> without optimization:
>> mov x10, 0
>> ldr x1, [x0, x10]
>>
>> with optimization:
>> ldr x1, [x0, 0]
>>
>> If the offset is negative, is not aligned correctly, or exceeds the max
>> value, fall back to using the temporary register.
>>
>> Result for test_bpf:
>> # dmesg -D
>> # insmod test_bpf.ko
>> # dmesg | grep Summary
>> test_bpf: Summary: 1009 PASSED, 0 FAILED, [997/997 JIT'ed]
>> test_bpf: test_tail_calls: Summary: 8 PASSED, 0 FAILED, [8/8 JIT'ed]
>> test_bpf: test_skb_segment: Summary: 2 PASSED, 0 FAILED
>>
>> Signed-off-by: Xu Kuohai <xukuohai at huawei.com>
> [...]
>
> Thanks for working on this and also including the result for test_bpf!
> Does it also contain corner cases where the fallback to the temporary
> register is triggered? (If not, let's add more test cases to it.)
Yes, I'll check and add some corner cases.
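
For reference, the corner cases follow from the three conditions quoted
above (negative, misaligned, or too large an offset). Below is a minimal
sketch of that check, assuming the immediate form in question is the A64
unsigned-offset LDR/STR encoding, whose 12-bit immediate is scaled by the
access size. The helper name offset_fits_scaled_imm12 and its size_log2
parameter are hypothetical and are not taken from the patch.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper (not from the patch): returns true if a BPF
 * load/store offset could be encoded directly as a scaled, unsigned
 * 12-bit immediate, assuming the A64 unsigned-offset LDR/STR form.
 *
 * size_log2 is log2 of the access size in bytes:
 * 0 = u8, 1 = u16, 2 = u32, 3 = u64.
 */
static inline bool offset_fits_scaled_imm12(int32_t off, unsigned int size_log2)
{
	/* Negative offsets cannot be expressed in an unsigned immediate. */
	if (off < 0)
		return false;

	/* The offset must be a multiple of the access size. */
	if (off & ((1 << size_log2) - 1))
		return false;

	/* After scaling, the value must fit in 12 bits (0..4095). */
	return (off >> size_log2) <= 0xfff;
}
```

Under these assumptions, a u64 access accepts offsets 0, 8, ..., 32760,
while -8, 4, and 32768 would all take the temporary-register path, so
those boundary values are natural candidates for extra test cases.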
>
> Could you split this into two patches, one that touches
> arch/arm64/lib/insn.c and arch/arm64/include/asm/insn.h for the
> instruction encoder, and then the other part for the JIT-only bits?
OK, will split in v3.
>
> Will, would you be okay if we route this via bpf-next with your Ack, or
> do we need to pull a feature branch again?
>
I'm fine with bpf-next.
> Thanks,
> Daniel