[PATCH v2 0/8] Fixes for load/store misaligned and access faults

Bo Gan ganboing at gmail.com
Mon Jun 8 14:42:11 PDT 2026


Hi Anirudh,

Glad you asked. The stack overflow issue you saw is the same one that
Vivian reported. I just sent another series that fixes it, among other things:

https://lore.kernel.org/opensbi/20260608211703.571-1-ganboing@gmail.com/T/#t

It optimizes stack usage, so we'll never overflow the stack again, and it
is also vlen-agnostic. If you could, please help review and validate it, as
I don't have any real HW that supports RVV 1.0, except perhaps the quite old
K230, which is itself a pain to test with. Thanks. I'm hesitant to go
the default stack-size bump route, as it may cause memory space issues on
machines with a huge number of cores or on tiny embedded ones with limited RAM.
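
To put rough numbers on that concern, here is a minimal sketch of how the
memory reserved for per-hart stacks scales. The helper names are made up for
illustration and are not OpenSBI code, and any sizes you plug in are your own
assumptions about a given platform:

```c
#include <stddef.h>

/* Total memory reserved for stacks when every hart gets its own stack. */
static size_t total_stack_bytes(size_t hart_count, size_t stack_size)
{
	return hart_count * stack_size;
}

/* Extra memory needed when the per-hart stack grows from old_size to new_size. */
static size_t stack_bump_cost(size_t hart_count, size_t old_size, size_t new_size)
{
	return total_stack_bytes(hart_count, new_size) -
	       total_stack_bytes(hart_count, old_size);
}
```

For example, doubling a hypothetical 8 KiB stack to 16 KiB costs 32 KiB extra
on 4 harts but 4 MiB extra on 512 harts, and on a small embedded part even the
single-hart increase can matter.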

BTW, PATCH 8 can be run anywhere, but it's for scalar misaligned ld/st
only. It doesn't test vector load/store.
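
For anyone reading the log below, the insn values can be decoded by hand. A
minimal sketch, assuming the standard RVV 1.0 vector load/store encoding; the
struct and function names are hypothetical and are not taken from OpenSBI:

```c
#include <stdint.h>

/* Fields of an RVV 1.0 vector load/store instruction. */
struct v_ldst_fields {
	unsigned opcode; /* insn[6:0]:   0x07 = LOAD-FP, 0x27 = STORE-FP */
	unsigned vreg;   /* insn[11:7]:  vd for loads, vs3 for stores */
	unsigned width;  /* insn[14:12]: 0 = 8-bit, 5 = 16-bit, 6 = 32-bit, 7 = 64-bit */
	unsigned rs1;    /* insn[19:15]: base address register */
	unsigned vm;     /* insn[25]:    1 = unmasked */
	unsigned mop;    /* insn[27:26]: 0 = unit-stride */
	unsigned nf;     /* insn[31:29]: number of fields minus one */
};

/* Extract the bit-fields listed above from a 32-bit instruction word. */
static struct v_ldst_fields decode_v_ldst(uint32_t insn)
{
	struct v_ldst_fields f;

	f.opcode = insn & 0x7f;
	f.vreg   = (insn >> 7) & 0x1f;
	f.width  = (insn >> 12) & 0x7;
	f.rs1    = (insn >> 15) & 0x1f;
	f.vm     = (insn >> 25) & 0x1;
	f.mop    = (insn >> 26) & 0x3;
	f.nf     = (insn >> 29) & 0x7;
	return f;
}

/* Element width in bytes for a vector width field, or 0 if it is not one. */
static unsigned eew_bytes(unsigned width)
{
	switch (width) {
	case 0: return 1;
	case 5: return 2;
	case 6: return 4;
	case 7: return 8;
	default: return 0;
	}
}
```

With this, 0x207d007 comes out as an unmasked unit-stride 16-bit load
(vle16.v v0, (a5)), 0x205e007 as vle32.v v0, (a1), and 0x2056027 as
vse32.v v0, (a0). The odd mtval addresses in the log are consistent with
misaligned element accesses for those widths.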

Bo

On 6/8/26 11:18, Anirudh Srinivasan wrote:
> Hello Bo,
> 
> 
> On Fri, Jun 5, 2026 at 6:34 AM Bo Gan <ganboing at gmail.com> wrote:
>>
>> Re-visit the load/store misaligned and access fault handlers to fix
>> issues related to coding patterns, floating-point state, and instruction
>> decoding:
> 
> I had previously reported here [1] that there were issues booting into
> Linux after enabling misaligned trap delegation to Linux on the SiFive
> X280. In the discussion over there, we concluded that bumping up the
> per-hart stack size in OpenSBI fixed the issue.
> 
> This series (without the stack size bump) also seems to fix the issues
> that prevented Linux from booting. In particular, it was this patch:
> "lib: sbi: Do not override emulator callback for vector load/store".
> 
> But as you say, I still think the stack size bump is needed as I was
> able to break the boot by adding some debug prints like this along the
> way. I guess this must have somehow caused the stack to overflow?
> 
> diff --git a/lib/sbi/sbi_trap_v_ldst.c b/lib/sbi/sbi_trap_v_ldst.c
> index 57f12b83..5e596664 100644
> --- a/lib/sbi/sbi_trap_v_ldst.c
> +++ b/lib/sbi/sbi_trap_v_ldst.c
> @@ -16,6 +16,7 @@
>   #include <sbi/sbi_trap.h>
>   #include <sbi/sbi_unpriv.h>
>   #include <sbi/sbi_trap.h>
> +#include <sbi/sbi_console.h>
> 
>   #ifdef OPENSBI_CC_SUPPORT_VECTOR
> 
> @@ -139,6 +140,8 @@ static inline void vsetvl(ulong vl, ulong vtype)
> 
>   int sbi_misaligned_v_ld_emulator(ulong insn, struct sbi_trap_context *tcntx)
>   {
> +       sbi_printf("%s: insn=0x%lx mepc=0x%lx mtval=0x%lx\n",
> +                  __func__, insn, tcntx->regs.mepc, tcntx->trap.tval);
>          const struct sbi_trap_info *orig_trap = &tcntx->trap;
>          struct sbi_trap_regs *regs = &tcntx->regs;
>          struct sbi_trap_info uptrap;
> @@ -238,6 +241,8 @@ int sbi_misaligned_v_ld_emulator(ulong insn, struct sbi_trap_context *tcntx)
> 
>   int sbi_misaligned_v_st_emulator(ulong insn, struct sbi_trap_context *tcntx)
>   {
> +       sbi_printf("%s: insn=0x%lx mepc=0x%lx mtval=0x%lx\n",
> +                  __func__, insn, tcntx->regs.mepc, tcntx->trap.tval);
>          const struct sbi_trap_info *orig_trap = &tcntx->trap;
>          struct sbi_trap_regs *regs = &tcntx->regs;
>          struct sbi_trap_info uptrap;
> 
> 
> [    0.075576] clocksource: jiffies: mask: 0xffffffff max_cycles:
> 0xffffffff, max_idle_ns: 7645041785100000 @Z0@Υ0 at 1] posix`Z0@Υ0 at s:
> 2048 T-0[0@�[0 at A0@%�Z0@��0@%�Z0@Υ0 at B,
> linea[0@Υ0@�Z0@ڤ0 at T-0[0@ڤ0 at T-0@([0@�[0A0@^^�[0@��0 at 2 KiB
> GF�0@�[0 at fA0@8.Q������~0@'`fR0@�0@�[0@(�0@`}0@�[0 at dit_enab�~0 at p}0@�l0 at c`�����/0@'`�o�����a@����
> 
> 
> 
> 
>                                         @�\0@`����[    0.155689] cpu1:
> Ratio of byte access time to unaligned word access is 0.01, unaligned
> accesses are slow
> [    0.155689] cpu3: Ratio of byte access time to unaligned word
> access is 0.01, unaligned accesses are slow
> [    0.183733] cpu0: Ratio of byte access time to unaligned word
> access is 0.01, unaligned accesses are slow
> sbi_misaligned_v_ld_emulator: insn=0x207d007 mepc=0xffffffff80015efc
> mtval=0xffff8f8000073d51
> sbi_misaligned_v_ld_emulator: insn=0x207d007 mepc=0xffffffff80015efc
> mtval=0xffff8f800013bd51
> sbi_misaligned_v_ld_emulator: insn=0x207d007 mepc=0xffffffff80015efc
> mtval=0xffff8f80000ebd51
> sbi_misaligned_v_ld_emulator: insn=0x207d007 mepc=0xffffffff80015efc
> mtval=0xffff8f8000113d51
> sbi_misaligned_v_ld_emulator: insn=0x205e007 mepc=0xffffffff80016fee
> mtval=0xffffaf8001b72003
> sbi_misaligned_v_ld_emulator: insn=0x205e007 mepc=0xffffffff80016fee
> mtval=0xffffaf8002112003
> sbi_misaligned_v_ld_emulator: insn=0x205e007 mepc=0xffffffff80016fee
> mtval=0xffffaf800214e003
> sbi_misaligned_v_ld_emulator: insn=0x205e007 mepc=0xffffffff80016fee
> mtval=0xffffaf8002026003
> sbi_misaligned_v_st_emulator: insn=0x2056027 mepc=0xffffffff80016ff2
> mtval=0xffffaf8001b70001
> sbi_misaligned_v_st_emulator: insn=0x2056027 mepc=0xffffffff80016ff2
> mtval=0xffffaf8002110001
> sbi_misaligned_v_st_emulator: insn=0x2056027 mepc=0xffffffff80016ff2
> mtval=0xffffaf800214c001
> sbi_misaligned_v_st_emulator: insn=0x2056027 mepc=0xffffffff80016ff2
> mtval=0xffffaf8002024001
> sbi_misaligned_v_ld_emulator: insn=0x205e007 mepc=0xffffffff80016fee
> mtval=0xffffaf8001b72023
> sbi_misaligned_v_ld_emulator: insn=0x205e007 mepc=0xffffffff80016fee
> mtval=0xffffaf8002112023
> sbi_misaligned_v_ld_emulator: insn=0x205e007 mepc=0xffffffff80016fee
> mtval=0xffffaf800214e023
> sbi_misaligned_v_ld_emulator: insn=0x205e007 mepc=0xffffffff80016fee
> mtval=0xffffaf8002026023
> sbi_misaligned_v_st_emulator: insn=0x2056027 mepc=0xffffffff80016ff2
> mtval=0xffffaf8001b70021
> sbi_misaligned_v_st_emulator: insn=0x2056027 mepc=0xffffffff80016ff2
> mtval=0xffffaf8002110021
> sbi_misaligned_v_st_emulator: insn=0x2056027 mepc=0xffffffff80016ff2
> mtval=0xffffaf800214c021
> sbi_misaligned_v_st_emulator: insn=0x2056027 mepc=0xffffffff80016ff2
> mtval=0xffffaf8002024021
> sbi_misaligned_v_ld_emulator: insn=0x205e007 mepc=0xffffffff80016fee
> mtval=0xffffaf8001b72043
> sbi_misaligned_v_ld_emulator: insn=0x205e007 mepc=0xffffffff80016fee
> mtval=0xffffaf8002112043
> sbi_misaligned_v_ld_emulator: insn=0x205e007 mepc=0xffffffff80016fee
> mtval=0xffffaf800214e043
> sbi_misaligned_v_ld_emulator: insn=0x205e007 mepc=0xffffffff80016fee
> mtval=0xffffaf8002026043
> sbi_misaligned_v_st_emulator: insn=0x2056027 mepc=0xffffffff80016ff2
> mtval=0xffffaf8001b70041
> sbi_misaligned_v_st_emulator: insn=0x2056027 mepc=0xffffffff80016ff2
> mtval=0xffffaf8002110041
> sbi_misaligned_v_st_emulator: insn=0x2056027 mepc=0xffffffff80016ff2
> mtval=0xffffaf800214c041
> sbi_misaligned_v_st_emulator: insn=0x2056027 mepc=0xffffffff80016ff2
> mtval=0xffffaf8002024041
> sbi_misaligned_v_l�X0 at lS0@�X0@�0 at f8001b72Y0@Y0 at M0@
> 
> �0@��������80021120�0@��������emulatoB0@^b�Z0@
> 
>0 at b�0@`Z0 at h�M������
> 
>                                               ������0@�6�Z0 at bb^
> 
>                                                                HL0 at B0@
> 
> 
> �([0@�Z0@    a��0@"
> 
> 
>                  �b�~0@��0 at ator: insn=0x205[0 at fA0@
> 
> 
>                                                  B0@^
> 
> 
>                                                      A0@[0 at sb
> 
> 
> 
> Bumping the stack size seems to fix this. If you want me to test
> anything else, let me know. Not sure if the tests in PATCH 8 are
> appropriate to test on this hw (I think they are for KVM guests
> only?).
> 
> [1] - https://lore.kernel.org/linux-riscv/nrvt74qnojaubiwjo37ums4lnclu466hovwrhmtbag6f5uhrql@q6msoe2oto4b/
> 
> 
> 
> 
> 
>>   - tinst should be zero'ed out to not confuse previous mode when
>>     redirecting faults, otherwise the vector insn can be mistaken
>>     as a regular load/store.
>>   - VS in previous mode must be set dirty for loads.
>>
>> These will be addressed in follow-up patches.
>>
>> [1] https://github.com/ganboing/qemu/tree/ganboing-misalign
>> [2] https://github.com/ganboing/qemu/tree/ganboing-misalign-no-tinst
>> [3] https://github.com/ganboing/opensbi/tree/fix-ldst-v2
>> ---
>> Changes in v2:
>>   - Addressed Anup's comment for PATCH 5 in v1
>>   - Validate load/store offset is 0 in misaligned faults w/ DEBUG build
>>
>> ---
>> Bo Gan (8):
>>    include: sbi: Add more mstatus and instruction encoding
>>    include: sbi: Add sbi_regs_prev_xlen
>>    include: sbi: Add GET_RDS_NUM/SET(_FP32/_FP64)_RDS macros
>>    include: sbi: set FS dirty in vsstatus when V=1
>>    lib: sbi: Do not override emulator callback for vector load/store
>>    Makefile: define OPENSBI_DEBUG if DEBUG builds
>>    lib: sbi: Rework load/store emulator instruction decoding
>>    [NOT-FOR-UPSTREAM] Test program for misaligned load/store
>>
>>   Makefile                     |   1 +
>>   include/sbi/riscv_encoding.h |  21 +-
>>   include/sbi/riscv_fp.h       |  30 ++-
>>   include/sbi/sbi_platform.h   |  92 +++++--
>>   include/sbi/sbi_trap.h       |  59 ++++
>>   include/sbi/sbi_trap_ldst.h  |   4 +-
>>   lib/sbi/sbi_trap_ldst.c      | 510 ++++++++++++++++++++++++-----------
>>   lib/sbi/sbi_trap_v_ldst.c    |  25 +-
>>   tests/ldst.S                 | 134 +++++++++
>>   tests/ldst.h                 | 170 ++++++++++++
>>   tests/test-misaligned-ldst.c | 154 +++++++++++
>>   11 files changed, 994 insertions(+), 206 deletions(-)
>>   create mode 100644 tests/ldst.S
>>   create mode 100644 tests/ldst.h
>>   create mode 100644 tests/test-misaligned-ldst.c
>>
>> --
>> 2.34.1
>>



