[PATCH v2] riscv: Dump user opcode bytes on fatal faults

Wed Aug 16 08:10:51 PDT 2023

Hi Yunhui,

Waking up the dead! ;-)

Yunhui Cui <cuiyunhui at bytedance.com> writes:

> We hit a problem where, as soon as the system starts executing init,
> init exits unexpectedly with the error message "unhandled signal 4
> code 0x1 ...".
>
> We wanted to know which instruction caused the exception. After
> dumping it through show_opcodes(), we found that it was a
> floating-point instruction.
>
> That led us to the root cause: during system bringup we had not
> enabled floating-point support (CONFIG_FPU is set, but
> COMPAT_HWCAP_ISA_F/D is not enabled in the DTS or ACPI).
>
> Like commit ba54d856a9d8 ("x86/fault: Dump user opcode bytes on fatal
> faults"), when an exception occurs, it is necessary to dump the
> instruction that caused the exception.

x86's show_opcodes() is used for both kernel oopses and unhandled
userland signals. On RISC-V there's dump_kernel_instr(), added in commit
eb165bfa8eaf ("riscv: Add instruction dump to RISC-V splats").

Wdyt about reworking that function, so that it works for userland epc as
well? I think it's useful to have the surrounding instruction context,
and not just one instruction.
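
To make the idea concrete, here is a rough userspace sketch of a single
dump routine that takes the memory accessor as a parameter, so the same
loop could serve both a kernel epc and a userland epc. Everything in it is
an assumption for illustration: read_parcel_fn, format_code_window(),
struct fake_mem and fake_read() are hypothetical names, and the window of
10 parcels before epc plus 2 from epc onwards is just a choice for the
example. In the kernel, the reader would be a nofault accessor picked by
user_mode(regs); this is not the actual dump_kernel_instr() code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical accessor: returns 0 on success, nonzero if 'addr' cannot
 * be read. It stands in for a kernel or user nofault read.
 */
typedef int (*read_parcel_fn)(const void *ctx, uintptr_t addr, uint16_t *val);

#define PARCELS_BEFORE 10 /* 16-bit parcels shown before epc (assumed) */
#define PARCELS_AFTER   2 /* the parcel at epc plus one more (assumed) */

/*
 * Format the 16-bit parcels around 'epc' into 'out', wrapping the parcel
 * at epc in parentheses. Returns 0 on success, -1 if any parcel is
 * unreadable or 'out' is too small.
 */
static int format_code_window(read_parcel_fn rd, const void *ctx,
                              uintptr_t epc, char *out, size_t outlen)
{
    size_t used = 0;

    assert(out != NULL && outlen > 0);
    out[0] = '\0';

    for (int i = -PARCELS_BEFORE; i < PARCELS_AFTER; i++) {
        uintptr_t addr = epc + (uintptr_t)((intptr_t)i * 2);
        uint16_t val;
        int n;

        if (rd(ctx, addr, &val) != 0)
            return -1; /* give up on the first unreadable parcel */

        n = snprintf(out + used, outlen - used,
                     i == 0 ? "(%04x) " : "%04x ", (unsigned)val);
        if (n < 0 || (size_t)n >= outlen - used)
            return -1;
        used += (size_t)n;
    }
    return 0;
}

/* Fake address space for the example: 'count' parcels starting at 'base'. */
struct fake_mem {
    uintptr_t base;
    const uint16_t *parcels;
    size_t count;
};

/* Bounds-checked reader over struct fake_mem; out-of-range reads "fault". */
static int fake_read(const void *ctx, uintptr_t addr, uint16_t *val)
{
    const struct fake_mem *m = ctx;

    if (addr < m->base || (addr - m->base) / 2 >= m->count)
        return -1;
    *val = m->parcels[(addr - m->base) / 2];
    return 0;
}
```

Dumping 16-bit parcels rather than 32-bit words keeps the output
meaningful when compressed instructions are in use, since the parcel in
parentheses is always the start of the faulting instruction.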

Björn