[PATCH -next] riscv/eBPF: Add BPF exception tables
Björn Töpel
bjorn.topel at gmail.com
Sun Oct 17 11:40:59 PDT 2021
Firstly, thanks a lot for working on the exception fixup handling!
Specific comments below, but first some generic input.
For the subject, please use "riscv, bpf" or "riscv: bpf:" instead of
"riscv/eBPF:", and target the bpf-next tree (details in
Documentation/bpf/bpf_devel_QA.rst).
Also, it would be really nice if the RISC-V 32b support got the same
fixup love! ;-)
I haven't taken the code for a spin/selftest run yet, but I'll try to
do it this week!
Subjective style input: Please use the full 100 chars lines, now that
we can! ;-)
On Fri, 15 Oct 2021 at 17:37, Tong Tiangen <tongtiangen at huawei.com> wrote:
>
> When a tracing BPF program attempts to read memory without using the
> bpf_probe_read() helper, the verifier marks the load instruction with
> the BPF_PROBE_MEM flag. Since the riscv JIT does not currently recognize
> this flag it falls back to the interpreter.
>
> Add support for BPF_PROBE_MEM, by appending an exception table to the
> BPF program. If the load instruction causes a data abort, the fixup
> infrastructure finds the exception table and fixes up the fault, by
> clearing the destination register and jumping over the faulting
> instruction.
>
> A more generic solution would add a "handler" field to the table entry.
>
So, would it make sense to add a handler field, to be more consistent
with say, x86?
This code is heavily based on the ARM64 one. Would a "handler", make
the ARM64 code simpler as well?
> The same issue in ARM64 is fixed in:
> commit 800834285361 ("bpf, arm64: Add BPF exception tables"),
>
> Signed-off-by: Tong Tiangen <tongtiangen at huawei.com>
> Tested-by: Pu Lehui <pulehui at huawei.com>
> ---
> arch/riscv/include/asm/Kbuild | 1 -
> arch/riscv/include/asm/extable.h | 49 ++++++++
> arch/riscv/include/asm/uaccess.h | 13 ---
> arch/riscv/mm/extable.c | 13 ++-
> arch/riscv/net/bpf_jit.h | 1 +
> arch/riscv/net/bpf_jit_comp64.c | 187 +++++++++++++++++++++++++------
> arch/riscv/net/bpf_jit_core.c | 18 ++-
> 7 files changed, 222 insertions(+), 60 deletions(-)
> create mode 100644 arch/riscv/include/asm/extable.h
>
> diff --git a/arch/riscv/include/asm/Kbuild b/arch/riscv/include/asm/Kbuild
> index 445ccc97305a..57b86fd9916c 100644
> --- a/arch/riscv/include/asm/Kbuild
> +++ b/arch/riscv/include/asm/Kbuild
> @@ -1,6 +1,5 @@
> # SPDX-License-Identifier: GPL-2.0
> generic-y += early_ioremap.h
> -generic-y += extable.h
> generic-y += flat.h
> generic-y += kvm_para.h
> generic-y += user.h
> diff --git a/arch/riscv/include/asm/extable.h b/arch/riscv/include/asm/extable.h
> new file mode 100644
> index 000000000000..8a52cdd122de
> --- /dev/null
> +++ b/arch/riscv/include/asm/extable.h
> @@ -0,0 +1,49 @@
> +/* SPDX-License-Identifier: GPL-2.0 */
> +#ifndef __ASM_EXTABLE_H
> +#define __ASM_EXTABLE_H
> +
> +/*
> + * The exception table consists of pairs of addresses: the first is the
> + * address of an instruction that is allowed to fault, and the second is
> + * the address at which the program should continue. No registers are
> + * modified, so it is entirely up to the continuation code to figure out
> + * what to do.
> + *
> + * All the routines below use bits of fixup code that are out of line
> + * with the main instruction path. This means when everything is well,
> + * we don't even have to jump over them. Further, they do not intrude
> + * on our cache or tlb entries.
> + */
> +struct exception_table_entry {
> + unsigned long insn, fixup;
> +};
> +
RISC-V uses the generic exception_table_entry (linux/extable.h), so I
don't see why it should be redefined here, without any
changes.
I'd suggest that instead of creating a new extable.h, simply fold the
changes directly in the mm/except.c. More on that below.
Unless, the "handler field" route is taken. Then, the new
exception_table_entry should go here.
> +#ifdef CONFIG_BPF_JIT
> +static inline bool in_bpf_jit(struct pt_regs *regs)
> +{
> + if (!IS_ENABLED(CONFIG_BPF_JIT))
> + return false;
> +
> + return regs->epc >= BPF_JIT_REGION_START &&
> + regs->epc < BPF_JIT_REGION_END;
Nit/FYI: 100 char lines is OK nowadays! ;-)
> +}
> +
> +int rv_bpf_fixup_exception(const struct exception_table_entry *ex,
> + struct pt_regs *regs);
> +#else /* !CONFIG_BPF_JIT */
> +static inline bool in_bpf_jit(struct pt_regs *regs)
> +{
> + return 0;
> +}
> +
> +static inline
> +int rv_bpf_fixup_exception(const struct exception_table_entry *ex,
> + struct pt_regs *regs)
> +{
> + return 0;
> +}
> +#endif /* !CONFIG_BPF_JIT */
> +
> +struct pt_regs;
> +extern int fixup_exception(struct pt_regs *regs);
> +#endif
> diff --git a/arch/riscv/include/asm/uaccess.h b/arch/riscv/include/asm/uaccess.h
> index f314ff44c48d..96ea91dc0e9c 100644
> --- a/arch/riscv/include/asm/uaccess.h
> +++ b/arch/riscv/include/asm/uaccess.h
> @@ -56,19 +56,6 @@ static inline int __access_ok(unsigned long addr, unsigned long size)
> return size <= TASK_SIZE && addr <= TASK_SIZE - size;
> }
>
> -/*
> - * The exception table consists of pairs of addresses: the first is the
> - * address of an instruction that is allowed to fault, and the second is
> - * the address at which the program should continue. No registers are
> - * modified, so it is entirely up to the continuation code to figure out
> - * what to do.
> - *
> - * All the routines below use bits of fixup code that are out of line
> - * with the main instruction path. This means when everything is well,
> - * we don't even have to jump over them. Further, they do not intrude
> - * on our cache or tlb entries.
> - */
> -
...and don't remove this...
> #define __LSW 0
> #define __MSW 1
>
> diff --git a/arch/riscv/mm/extable.c b/arch/riscv/mm/extable.c
> index 2fc729422151..f8055c6d0f32 100644
> --- a/arch/riscv/mm/extable.c
> +++ b/arch/riscv/mm/extable.c
> @@ -16,9 +16,12 @@ int fixup_exception(struct pt_regs *regs)
> const struct exception_table_entry *fixup;
>
> fixup = search_exception_tables(regs->epc);
> - if (fixup) {
> - regs->epc = fixup->fixup;
> - return 1;
> - }
> - return 0;
> + if (!fixup)
> + return 0;
> +
> + if (in_bpf_jit(regs))
> + return rv_bpf_fixup_exception(fixup, regs);
> +
...instead add the definition of in_bpf_jit() here, and the
CONFIG_BPF_JIT around the "if (in_bpf_jit()...)"
> + regs->epc = fixup->fixup;
> + return 1;
> }
> diff --git a/arch/riscv/net/bpf_jit.h b/arch/riscv/net/bpf_jit.h
> index 75c1e9996867..82f717cc98f7 100644
> --- a/arch/riscv/net/bpf_jit.h
> +++ b/arch/riscv/net/bpf_jit.h
> @@ -71,6 +71,7 @@ struct rv_jit_context {
> int ninsns;
> int epilogue_offset;
> int *offset; /* BPF to RV */
> + int exentry_idx;
I'd prefer if this would be named "nexentries" or something, so it's
more consistent with "ninsns", clear that "this is number of extable
entries".
> unsigned long flags;
> int stack_size;
> };
> diff --git a/arch/riscv/net/bpf_jit_comp64.c b/arch/riscv/net/bpf_jit_comp64.c
> index 3af4131c22c7..97411d43785e 100644
> --- a/arch/riscv/net/bpf_jit_comp64.c
> +++ b/arch/riscv/net/bpf_jit_comp64.c
> @@ -5,6 +5,7 @@
> *
> */
>
> +#include <linux/bitfield.h>
> #include <linux/bpf.h>
> #include <linux/filter.h>
> #include "bpf_jit.h"
> @@ -27,6 +28,21 @@ static const int regmap[] = {
> [BPF_REG_AX] = RV_REG_T0,
> };
>
> +static const int pt_regmap[] = {
> + [RV_REG_A5] = offsetof(struct pt_regs, a5),
> + [RV_REG_A0] = offsetof(struct pt_regs, a0),
> + [RV_REG_A1] = offsetof(struct pt_regs, a1),
> + [RV_REG_A2] = offsetof(struct pt_regs, a2),
> + [RV_REG_A3] = offsetof(struct pt_regs, a3),
> + [RV_REG_A4] = offsetof(struct pt_regs, a4),
> + [RV_REG_S1] = offsetof(struct pt_regs, s1),
> + [RV_REG_S2] = offsetof(struct pt_regs, s2),
> + [RV_REG_S3] = offsetof(struct pt_regs, s3),
> + [RV_REG_S4] = offsetof(struct pt_regs, s4),
> + [RV_REG_S5] = offsetof(struct pt_regs, s5),
> + [RV_REG_T0] = offsetof(struct pt_regs, t0),
> +};
> +
> enum {
> RV_CTX_F_SEEN_TAIL_CALL = 0,
> RV_CTX_F_SEEN_CALL = RV_REG_RA,
> @@ -440,6 +456,71 @@ static int emit_call(bool fixed, u64 addr, struct rv_jit_context *ctx)
> return 0;
> }
>
> +#define BPF_FIXUP_OFFSET_MASK GENMASK(26, 0)
> +#define BPF_FIXUP_REG_MASK GENMASK(31, 27)
> +
> +int rv_bpf_fixup_exception(const struct exception_table_entry *ex,
> + struct pt_regs *regs)
> +{
> + off_t offset = FIELD_GET(BPF_FIXUP_OFFSET_MASK, ex->fixup);
> + int regs_offset = FIELD_GET(BPF_FIXUP_REG_MASK, ex->fixup);
> +
> + *(unsigned long *)((unsigned char *)regs + pt_regmap[regs_offset]) = 0;
> + regs->epc = (unsigned long)&ex->fixup - offset;
> +
> + return 1;
> +}
> +
> +/* For accesses to BTF pointers, add an entry to the exception table */
> +static int add_exception_handler(const struct bpf_insn *insn,
> + struct rv_jit_context *ctx,
> + int dst_reg, int insn_len)
> +{
> + off_t offset;
> + unsigned long pc;
> + struct exception_table_entry *ex;
> +
Nit: Please use reverse xmas tree sorting (longest lines first).
[...]
Cheers,
Björn
More information about the linux-riscv
mailing list