[PATCH 0/9] Improve trap handling for nested traps
Bo Gan
ganboing at gmail.com
Tue Mar 12 00:59:47 PDT 2024
On 3/12/24 12:43 AM, Anup Patel wrote:
> On Tue, Mar 12, 2024 at 11:11 AM Bo Gan <ganboing at gmail.com> wrote:
>>
>> On 3/11/24 10:18 PM, Anup Patel wrote:
>>> On Tue, Mar 12, 2024 at 10:03 AM Bo Gan <ganboing at gmail.com> wrote:
>>>>
>>>> On 3/11/24 8:43 PM, Anup Patel wrote:
>>>>> On Tue, Mar 12, 2024 at 6:31 AM Bo Gan <ganboing at gmail.com> wrote:
>>>>>>
>>>>>> On 3/11/24 9:09 AM, Anup Patel wrote:
>>>>>>> Nested traps will be a common when dealing with RAS error traps so
>>>>>>> this series improves trap handling for nested traps by introducing
>>>>>>> a linked-list based trap context chain.
>>>>>>>
>>>>>>> These patches can also be found the trap_handling_imp_v1 branch at
>>>>>>> https://github.com/avpatel/opensbi.git
>>>>>>>
>>>>>>> Anup Patel (9):
>>>>>>> lib: sbi: Remove sbi_trap_exit() and related code
>>>>>>> include: sbi: Add trap_context pointer in struct sbi_scratch
>>>>>>> lib: sbi: Introduce trap context
>>>>>>> lib: sbi: Simplify parameters of misaligned and access fault handlers
>>>>>>> lib: sbi: Simplify parameters of sbi_illegal_insn_handler()
>>>>>>> lib: sbi: Remove regs paramter of sbi_irqchip_process()
>>>>>>> lib: sbi: Remove regs parameter from trap irq handling functions
>>>>>>> lib: sbi: Pass trap context pointer to sbi_ecall_handler()
>>>>>>> lib: sbi: Extend sbi_trap_error() to dump state in a nested trap
>>>>>>>
>>>>>>> firmware/fw_base.S | 14 +--
>>>>>>> include/sbi/sbi_ecall.h | 4 +-
>>>>>>> include/sbi/sbi_illegal_insn.h | 4 +-
>>>>>>> include/sbi/sbi_irqchip.h | 5 +-
>>>>>>> include/sbi/sbi_scratch.h | 14 +--
>>>>>>> include/sbi/sbi_trap.h | 24 ++++-
>>>>>>> include/sbi/sbi_trap_ldst.h | 12 +--
>>>>>>> lib/sbi/sbi_ecall.c | 3 +-
>>>>>>> lib/sbi/sbi_illegal_insn.c | 14 +--
>>>>>>> lib/sbi/sbi_irqchip.c | 10 +-
>>>>>>> lib/sbi/sbi_trap.c | 186 +++++++++++++++++----------------
>>>>>>> lib/sbi/sbi_trap_ldst.c | 67 ++++++------
>>>>>>> lib/utils/irqchip/imsic.c | 2 +-
>>>>>>> 13 files changed, 185 insertions(+), 174 deletions(-)
>>>>>>>
>>>>>>
>>>>>> Hi Anup,
>>>>>>
>>>>>> Can you help providing an example for nested traps and the RAS extension?
>>>>>> The closest spec I can find (https://github.com/riscv/riscv-ssrastraps) is
>>>>>> still empty. I'm wondering where I can find the related documentation.
>>>>>
>>>>> Refer, "3.1.15. Machine Cause Register" of the draft Priv v1.13 specification.
>>>>> https://github.com/riscv/riscv-isa-manual/releases/download/riscv-isa-release-157641b-2024-03-12/priv-isa-asciidoc.pdf
>>>>>
>>>>> Regards,
>>>>> Anup
>>>>>
>>>>
>>>>
>>>> Hi Anup, Thanks for the pointer. I know what nested trap is. My concern is when do
>>>> we expect such trap to happen? Does it mean for every memory access in M mode, it
>>>> might trigger a RAS fault, causing a nested trap? Or even the fault can be delivered
>>>> asynchronously?
>>>
>>> RAS error can occur at any time. A synchronous RAS errors will be taken
>>> as an exception (mcause = 19) whereas asynchronous RAS errors will be
>>> taken as RAS local interrupt or RAS external interrupt (through interrupt
>>> controller).
>>>
>>> Only RAS synchronous errors can cause nested trap.
>>>
>>>> If that's the case, how can we even safely handle nested traps? E.g.,
>>>> The entry of _trap_handler doesn't look like reentrant-safe to me. Perhaps you plan
>>>> to enhance it in later patches.
>>>>
>>>
>>> The _trap_handler pushes the register state on stack. It already takes care
>>> of nesting by continuing the same SP if the trap was taken while in M-mode.
>>>
>>> If _trap_handler() did not support nesting then I would not be able to test
>>> this series.
>>>
>>> Refer, TRAP_SAVE_AND_SETUP_SP_T0() in fw_base.S
>>>
>>> Regards,
>>> Anup
>>>
>>
>> I'm looking at the exact code you are pointing.
>>
>>> .macro TRAP_SAVE_AND_SETUP_SP_T0
>>> /* Swap TP and MSCRATCH */
>>> csrrw tp, CSR_MSCRATCH, tp
>>
>> ===> what if RAS exception gets triggered here?
>>
>>> /* Save T0 in scratch space */
>>> REG_S t0, SBI_SCRATCH_TMP0_OFFSET(tp)
>>>
>>
>> ===> Or here?
>>
>>> /*
>>> * Set T0 to appropriate exception stack
>>> *
>>> * Came_From_M_Mode = ((MSTATUS.MPP < PRV_M) ? 1 : 0) - 1;
>>> * Exception_Stack = TP ^ (Came_From_M_Mode & (SP ^ TP))
>>> *
>>> * Came_From_M_Mode = 0 ==> Exception_Stack = TP
>>> * Came_From_M_Mode = -1 ==> Exception_Stack = SP
>>> */
>>
>> ===> Or somewhere below?
>>
>>> csrr t0, CSR_MSTATUS
>>> srl t0, t0, MSTATUS_MPP_SHIFT
>>> and t0, t0, PRV_M
>> .....
>>
>> We only have one spare register CSR_MSCRATCH to work with before we
>> can work on stack. How are we suppose to do the nested trap safely?
>> Am i missing something?
>>
>
> If we get a nested trap before _trap_handler() gets a chance to save
> state then nested trap handling will not work.
>
> The _trap_handler() functionally supports nesting but it is certainly
> not 100% safe due to lack of additional HW support.
>
> The upcoming double trap extensions (particularly Smdbltrp) will
> help us provide 100% safety in the low-level _trap_handler().
> (Refer, https://wiki.riscv.org/display/HOME/RISC-V+Specification+Status)
>
> Think of this series as preparatory work for using double trap
> extension in OpenSBI.
>
> Regards,
> Anup
>
Thanks Anup. This double trap extension fills the void. Now this whole
series make a lot more sense to me with all these background information.
Bo
More information about the opensbi
mailing list