[PATCH 0/1] Fix for riscv vmcore issue
Pnina Feder
PNINA.FEDER at mobileye.com
Thu Jul 3 05:06:32 PDT 2025
> Pnina!
>
> Pnina Feder <pnina.feder at mobileye.com> writes:
>
> > We are creating a vmcore using kexec on a Linux 6.15 RISC-V system and
> > analyzing it with the crash tool on the host. This workflow used to
> > work on Linux 6.14 but is now broken in 6.15.
>
> Thanks for reporting this!
>
> > The issue is caused by a change in the kernel:
> > In Linux 6.15, certain memblock sections are now marked as Reserved in
> > /proc/iomem. The kexec tool excludes all Reserved regions when
> > generating the vmcore, so these sections are missing from the dump.
>
> How are you collecting the /proc/vmcore file? A full set of commands would be helpful.
>
Our system is configured to call panic() whenever a process crashes.
To handle crash recovery, we load the crash kernel with kexec:

  kexec -p /Image --initrd=/rootfs.cpio --append "console=${con} earlycon=${earlycon} no4lvl"

To simulate a crash, we run:

  sleep 100 & kill -6 $!

This boots into the crash kernel (kdump), where we copy the /proc/vmcore file back to the host for analysis.

> > However, the kernel still uses addresses in these regions—for example,
> > for IRQ pointers. Since the crash tool needs access to these memory
> > areas to function correctly, their exclusion breaks the analysis.
>
> Wdym with "IRQ pointers"? Also, what version (sha1) of crash are you using?
>
We are currently using crash-utility version 9.0.0 (master).
From the crash analysis logs, we observed errors like:
......
IRQ stack pointer[0] is ffffffd6fbdcc068
crash: read error: kernel virtual address: ffffffd6fbdcc068 type: "IRQ stack pointer"
.....
<read_kdump: addr: ffffffff80edf1cc paddr: 8010df1cc cnt: 4>
<readmem: ffffffd6fbdd6880, KVADDR, "runqueues entry (per_cpu)", 3456, (FOE), 55acf03963e0>
<read_kdump: addr: ffffffd6fbdd6880 paddr: 8fbdd6880 cnt: 1920>
crash: read error: kernel virtual address: ffffffd6fbdd6880 type: "runqueues entry (per_cpu)"

These failures occur consistently for addresses in the 0xffffffd000000000 region.
On inspection, we confirmed that the physical addresses backing those virtual addresses are missing from the vmcore because they fall within Reserved memory regions.
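
A check of this kind can be sketched in Python as follows. This is an illustrative sketch, not the procedure used above: the /proc/iomem excerpt is hypothetical (invented ranges, chosen only so the example runs), and the helper names are made up. Only the two paddr values are taken from the read_kdump lines in the log. On a real system, /proc/iomem has to be read as root, otherwise the addresses are shown as zeros.

```python
import re

# Matches one /proc/iomem line, e.g. "  8fbc00000-8fbffffff : Reserved".
_IOMEM_LINE = re.compile(r"^\s*([0-9a-fA-F]+)-([0-9a-fA-F]+) : (.*)$")


def parse_iomem(text):
    """Return (start, end, name) tuples for every range in /proc/iomem text."""
    ranges = []
    for line in text.splitlines():
        m = _IOMEM_LINE.match(line)
        if m:
            start, end = int(m.group(1), 16), int(m.group(2), 16)
            ranges.append((start, end, m.group(3).strip()))
    return ranges


def regions_containing(ranges, paddr):
    """Names of all ranges (parents and children) that contain paddr."""
    return [name for start, end, name in ranges if start <= paddr <= end]


def is_reserved(ranges, paddr):
    """True if any range containing paddr is labelled Reserved."""
    return any(name.lower().startswith("reserved")
               for name in regions_containing(ranges, paddr))


# HYPOTHETICAL excerpt: these ranges are invented for illustration and do
# not come from the system discussed in this thread.
SAMPLE_IOMEM = """\
800000000-8ffffffff : System RAM
  8fbc00000-8fbffffff : Reserved
"""

if __name__ == "__main__":
    ranges = parse_iomem(SAMPLE_IOMEM)
    # paddr values copied from the read_kdump lines in the log above.
    for paddr in (0x8010DF1CC, 0x8FBDD6880):
        print(hex(paddr), regions_containing(ranges, paddr),
              "reserved" if is_reserved(ranges, paddr) else "not reserved")
```

To use it against a real machine, replace SAMPLE_IOMEM with the contents of /proc/iomem captured as root on the crashed kernel's system before the panic.
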
We tested a patch to kexec-tools that prevents exclusion of the Reserved-memblock section from the vmcore. With this patch, the issue no longer occurs, and crash analysis succeeds.
Note: I suspect the same issue exists on ARM64, as the signal.c and kexec-tools implementations there are similar.
>
> Thanks!
> Björn