[PATCH 0/1] Fix for riscv vmcore issue
Pnina Feder
PNINA.FEDER at mobileye.com
Mon Jul 14 05:00:27 PDT 2025
>>>Hi Pnina,
>>
>>>> Pnina!
>>>>
>>>> Pnina Feder <pnina.feder at mobileye.com> writes:
>>>>
>>>>> We are creating a vmcore using kexec on a Linux 6.15 RISC-V system
>>>>> and analyzing it with the crash tool on the host. This workflow
>>>>> used to work on Linux 6.14 but is now broken in 6.15.
>>>> Thanks for reporting this!
>>>>
>>>>> The issue is caused by a change in the kernel:
>>>>> In Linux 6.15, certain memblock sections are now marked as Reserved
>>>>> in /proc/iomem. The kexec tool excludes all Reserved regions when
>>>>> generating the vmcore, so these sections are missing from the dump.
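To make the effect described above concrete, here is a minimal sketch that subtracts Reserved ranges from System RAM in an /proc/iomem-style listing. It is not the kexec-tools code. The sample text, its addresses and the helper names (parse_iomem, dumpable_ranges) are made up for illustration, and the assumption is simply that anything labelled "Reserved" is dropped from the dump. On a real system /proc/iomem only shows real addresses to root.

```python
import re

# Hypothetical /proc/iomem excerpt; addresses and layout are illustrative only.
SAMPLE_IOMEM = """\
80000000-8003ffff : Reserved
80040000-ffffffff : System RAM
  80200000-80ffffff : Kernel image
  fbd00000-fbdfffff : Reserved
"""

LINE_RE = re.compile(r"^(\s*)([0-9a-fA-F]+)-([0-9a-fA-F]+) : (.+)$")


def parse_iomem(text):
    """Return (depth, start, end, name) tuples for each resource line."""
    regions = []
    for line in text.splitlines():
        m = LINE_RE.match(line)
        if not m:
            continue
        indent, start, end, name = m.groups()
        regions.append((len(indent) // 2, int(start, 16), int(end, 16), name.strip()))
    return regions


def dumpable_ranges(regions):
    """System RAM with every overlapping Reserved range cut out (inclusive ends)."""
    ram = [(s, e) for _, s, e, n in regions if n == "System RAM"]
    reserved = sorted((s, e) for _, s, e, n in regions if n == "Reserved")
    out = []
    for ram_start, ram_end in ram:
        cur = ram_start
        for s, e in reserved:
            if e < cur or s > ram_end:
                continue  # no overlap with what is left of this RAM range
            if s > cur:
                out.append((cur, s - 1))
            cur = max(cur, e + 1)
        if cur <= ram_end:
            out.append((cur, ram_end))
    return out


for start, end in dumpable_ranges(parse_iomem(SAMPLE_IOMEM)):
    print(f"{start:x}-{end:x}")
```

In this made-up listing the Reserved range inside System RAM leaves a hole in the dumpable ranges. Any kernel data that lives in such a hole cannot be read back from the vmcore.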
>>>> How are you collecting the /proc/vmcore file? A full set of commands would be helpful.
>>>>
>>> Our system is configured to call panic() whenever a process crashes.
>>> To handle crash recovery, we're using kexec with the following command:
>>> kexec -p /Image --initrd=/rootfs.cpio --append "console=${con} earlycon=${earlycon} no4lvl"
>>>
>>> To simulate a crash, we run:
>>> sleep 100 & kill -6 $!
>>>
>>> This boots into the crash kernel (kdump), where we then copy the /proc/vmcore file back to the host for analysis.
>>>
>>>>> However, the kernel still uses addresses in these regions—for
>>>>> example, for IRQ pointers. Since the crash tool needs access to
>>>>> these memory areas to function correctly, their exclusion breaks the analysis.
>>>> What do you mean by "IRQ pointers"? Also, what version (sha1) of crash are you using?
>>>>
>>>> We are currently using crash-utility version 9.0.0 (master).
>>>> From the crash analysis logs, we observed errors like:
>>>>
>>> "......
>>> IRQ stack pointer[0] is ffffffd6fbdcc068
>>>> crash: read error: kernel virtual address: ffffffd6fbdcc068 type: "IRQ stack pointer"
>>> .....
>>>
>>>> <read_kdump: addr: ffffffff80edf1cc paddr: 8010df1cc cnt: 4>
>>> <readmem: ffffffd6fbdd6880, KVADDR, "runqueues entry (per_cpu)",
>>> 3456, (FOE), 55acf03963e0>
>>>> <read_kdump: addr: ffffffd6fbdd6880 paddr: 8fbdd6880 cnt: 1920>
>>> crash: read error: kernel virtual address: ffffffd6fbdd6880 type: "runqueues entry (per_cpu)"
>>
>>
>>I can't reproduce this issue on qemu, booting with sv39. I'm using the latest kexec-tools (which recently merged riscv support), crash 9.0.0 and kernel 6.16.0-rc4. Note that I'm running crash inside qemu.
>>
>>Are you able to reproduce this on qemu too?
>
>Yes, I am also using qemu for both the main kernel and the crash kernel, with the latest kexec-tools, crash 9.0.0 and kernel 6.15.
>
>
>>Maybe that's related to the config. Can you share yours?
>
>This is my dev_config:
>
> CONFIG_SYSVIPC=y
> CONFIG_POSIX_MQUEUE=y
> CONFIG_AUDIT=y
> CONFIG_NO_HZ_IDLE=y
> CONFIG_HIGH_RES_TIMERS=y
> CONFIG_BPF_SYSCALL=y
> CONFIG_PREEMPT_RT=y
> CONFIG_TASKSTATS=y
> CONFIG_TASK_DELAY_ACCT=y
> CONFIG_PSI=y
> CONFIG_IKCONFIG=y
> CONFIG_IKCONFIG_PROC=y
> CONFIG_CGROUPS=y
> CONFIG_MEMCG=y
> CONFIG_CGROUP_SCHED=y
> CONFIG_CFS_BANDWIDTH=y
> CONFIG_RT_GROUP_SCHED=y
> CONFIG_CGROUP_PIDS=y
> CONFIG_CGROUP_FREEZER=y
> CONFIG_CGROUP_HUGETLB=y
> CONFIG_CPUSETS=y
> CONFIG_CGROUP_DEVICE=y
> CONFIG_CGROUP_CPUACCT=y
> CONFIG_CGROUP_PERF=y
> CONFIG_CGROUP_BPF=y
> CONFIG_NAMESPACES=y
> CONFIG_USER_NS=y
> CONFIG_CHECKPOINT_RESTORE=y
> CONFIG_BLK_DEV_INITRD=y
> CONFIG_EXPERT=y
> CONFIG_PROFILING=y
> CONFIG_KEXEC=y
> CONFIG_ARCH_VIRT=y
> CONFIG_NONPORTABLE=y
> CONFIG_SMP=y
> CONFIG_NR_CPUS=32
> CONFIG_HZ_1000=y
> CONFIG_CPU_IDLE=y
> CONFIG_MODULES=y
> CONFIG_MODULE_UNLOAD=y
> CONFIG_IOSCHED_BFQ=y
> CONFIG_PAGE_REPORTING=y
> CONFIG_PERCPU_STATS=y
> CONFIG_NET=y
> CONFIG_PACKET=y
> CONFIG_UNIX=y
> CONFIG_XFRM_USER=m
> CONFIG_INET=y
> CONFIG_IP_MULTICAST=y
> CONFIG_IP_ADVANCED_ROUTER=y
> CONFIG_INET_ESP=m
> CONFIG_NETWORK_SECMARK=y
> CONFIG_NETFILTER=y
> CONFIG_IP_NF_IPTABLES=y
> CONFIG_IP_NF_FILTER=y
> CONFIG_BRIDGE=m
> CONFIG_BRIDGE_VLAN_FILTERING=y
> CONFIG_VLAN_8021Q=m
> CONFIG_NET_SCHED=y
> CONFIG_NET_CLS_CGROUP=m
> CONFIG_NETLINK_DIAG=y
> CONFIG_NET_L3_MASTER_DEV=y
> CONFIG_CGROUP_NET_PRIO=y
> CONFIG_FAILOVER=y
> CONFIG_DEVTMPFS=y
> CONFIG_DEVTMPFS_MOUNT=y
> CONFIG_MTD=y
> CONFIG_MTD_BLOCK=y
> CONFIG_MTD_CFI=y
> CONFIG_MTD_CFI_INTELEXT=y
> CONFIG_MTD_PHYSMAP=y
> CONFIG_MTD_PHYSMAP_OF=y
> CONFIG_BLK_DEV_LOOP=y
> CONFIG_BLK_DEV_LOOP_MIN_COUNT=0
> CONFIG_VIRTIO_BLK=y
> CONFIG_SCSI=y
> CONFIG_BLK_DEV_SD=y
> CONFIG_SCSI_VIRTIO=y
> CONFIG_MD=y
> CONFIG_BLK_DEV_DM=y
> CONFIG_NETDEVICES=y
> CONFIG_MACB=y
> CONFIG_PCS_XPCS=m
> CONFIG_SERIO_LIBPS2=y
> CONFIG_VT_HW_CONSOLE_BINDING=y
> CONFIG_LEGACY_PTY_COUNT=16
> CONFIG_SERIAL_8250=y
> CONFIG_SERIAL_8250_CONSOLE=y
> CONFIG_SERIAL_OF_PLATFORM=y
> CONFIG_SERIAL_EARLYCON_RISCV_SBI=y
> CONFIG_VIRTIO_CONSOLE=y
> CONFIG_HW_RANDOM=y
> CONFIG_HW_RANDOM_VIRTIO=y
> CONFIG_I2C=y
> CONFIG_I2C_DESIGNWARE_CORE=y
> CONFIG_SPI=y
> CONFIG_PINCTRL=y
> CONFIG_PINCTRL_SINGLE=y
> CONFIG_GPIOLIB=y
> CONFIG_GPIO_SYSFS=y
> CONFIG_GPIO_DWAPB=y
> CONFIG_GPIO_SIFIVE=y
> CONFIG_POWER_SUPPLY=y
> CONFIG_WATCHDOG=y
> CONFIG_WATCHDOG_CORE=y
> CONFIG_REGULATOR=y
> CONFIG_REGULATOR_FIXED_VOLTAGE=y
> CONFIG_BACKLIGHT_CLASS_DEVICE=m
> CONFIG_SCSI_UFSHCD=y
> CONFIG_SCSI_UFSHCD_PLATFORM=y
> CONFIG_SCSI_UFS_DWC_TC_PLATFORM=y
> CONFIG_RTC_CLASS=y
> CONFIG_RTC_DRV_M41T80=y
> CONFIG_DMADEVICES=y
> CONFIG_SYNC_FILE=y
> CONFIG_COMMON_CLK_EYEQ=y
> CONFIG_RPMSG_CHAR=y
> CONFIG_RPMSG_CTRL=y
> CONFIG_RPMSG_VIRTIO=y
> CONFIG_RESET_CONTROLLER=y
> CONFIG_RESET_SIMPLE=y
> CONFIG_GENERIC_PHY=y
> CONFIG_EXT4_FS=y
> CONFIG_EXT4_FS_POSIX_ACL=y
> CONFIG_EXT4_FS_SECURITY=y
> CONFIG_MSDOS_FS=y
> CONFIG_VFAT_FS=y
> CONFIG_TMPFS=y
> CONFIG_TMPFS_POSIX_ACL=y
> CONFIG_HUGETLBFS=y
> CONFIG_KEYS=y
> CONFIG_SECURITY=y
> CONFIG_SECURITYFS=y
> CONFIG_SECURITY_NETWORK=y
> CONFIG_SECURITY_PATH=y
> CONFIG_CRYPTO_RSA=y
> CONFIG_CRYPTO_ECB=y
> CONFIG_CRYPTO_BLAKE2B=m
> CONFIG_CRYPTO_XXHASH=m
> CONFIG_CRYPTO_USER_API_HASH=y
> CONFIG_CRC_CCITT=m
> CONFIG_CRC_ITU_T=y
> CONFIG_CRC7=y
> CONFIG_LIBCRC32C=m
> CONFIG_PRINTK_TIME=y
> CONFIG_DYNAMIC_DEBUG=y
> CONFIG_DEBUG_INFO_DWARF5=y
> CONFIG_DEBUG_FS=y
> CONFIG_DEBUG_PAGEALLOC=y
> CONFIG_PTDUMP_DEBUGFS=y
> CONFIG_SCHED_STACK_END_CHECK=y
> CONFIG_DEBUG_VM=y
> CONFIG_DEBUG_VM_PGFLAGS=y
> CONFIG_DEBUG_MEMORY_INIT=y
> CONFIG_DEBUG_PER_CPU_MAPS=y
> CONFIG_SOFTLOCKUP_DETECTOR=y
> CONFIG_WQ_WATCHDOG=y
> CONFIG_DEBUG_RT_MUTEXES=y
> CONFIG_DEBUG_SPINLOCK=y
> CONFIG_DEBUG_ATOMIC_SLEEP=y
> CONFIG_DEBUG_LIST=y
> CONFIG_DEBUG_PLIST=y
> CONFIG_DEBUG_SG=y
> CONFIG_RCU_EQS_DEBUG=y
> CONFIG_MEMTEST=y
>
>>> These failures occur consistently for addresses in the 0xffffffd000000000 region.
>>
>>
>>FYI, this region is the direct mapping (see Documentation/arch/riscv/vm-layout.rst).
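Since this region is a linear mapping, the difference between a virtual address and its physical address is constant. As a small worked example, the read_kdump line quoted earlier in the thread pairs addr ffffffd6fbdd6880 with paddr 8fbdd6880, so the offset can be taken from that pair and applied to the failing IRQ stack pointer. This assumes both addresses sit in the same direct mapping, and the offset is specific to this boot. The name linear_to_phys is only illustrative.

```python
# Pair taken from the read_kdump line in the crash log quoted in this thread.
KNOWN_VADDR = 0xffffffd6fbdd6880
KNOWN_PADDR = 0x8fbdd6880

# In a linear (direct) mapping, vaddr - paddr is the same for every address.
VA_PA_OFFSET = KNOWN_VADDR - KNOWN_PADDR


def linear_to_phys(vaddr):
    """Translate a direct-mapping virtual address to a physical address."""
    return vaddr - VA_PA_OFFSET


# "IRQ stack pointer[0]" from the same log.
irq_stack_ptr = 0xffffffd6fbdcc068
print(hex(linear_to_phys(irq_stack_ptr)))  # prints 0x8fbdcc068
```

The resulting physical address is the one that has to be covered by a segment of the vmcore for crash to read the IRQ stack pointer.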
>>
>>Thanks,
>>
>>Alex
>>
Hi Alex!
Is there anything I can try, or any way I can help move this issue forward?
Could you perhaps share your config so that I can try it on my system?
Is there any more information I can share?
Thanks a lot,
Pnina
>>
>>> Upon inspection, we confirmed that the physical addresses corresponding to those virtual addresses are not present in the vmcore, as they fall under Reserved memory sections.
>>> We tested a patch to kexec-tools that prevents exclusion of the Reserved-memblock section from the vmcore. With this patch, the issue no longer occurs, and crash analysis succeeds.
>>> Note: I suspect the same issue exists on ARM64, as both the signal.c and kexec-tools implementations are similar.
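One way to check whether a physical address is present in a vmcore is to walk the PT_LOAD program headers of the ELF file and test each p_paddr range. The following is a minimal sketch under stated assumptions: a little-endian ELF64 vmcore, program headers that fit in the bytes passed in, and no handling of extended program header counts. The helper names (load_segments, covers) are hypothetical and not part of crash or kexec-tools.

```python
import struct

EHDR = struct.Struct("<16sHHIQQQIHHHHHH")  # Elf64_Ehdr, little-endian
PHDR = struct.Struct("<IIQQQQQQ")          # Elf64_Phdr, little-endian
PT_LOAD = 1


def load_segments(blob):
    """Return [(p_paddr, p_memsz), ...] for each PT_LOAD header in blob."""
    ehdr = EHDR.unpack_from(blob, 0)
    ident, e_phoff, e_phentsize, e_phnum = ehdr[0], ehdr[5], ehdr[9], ehdr[10]
    # ident[4] == 2 means ELFCLASS64, ident[5] == 1 means little-endian data.
    if ident[:4] != b"\x7fELF" or ident[4] != 2 or ident[5] != 1:
        raise ValueError("expected a little-endian ELF64 file")
    segments = []
    for i in range(e_phnum):
        phdr = PHDR.unpack_from(blob, e_phoff + i * e_phentsize)
        p_type, p_paddr, p_memsz = phdr[0], phdr[4], phdr[6]
        if p_type == PT_LOAD:
            segments.append((p_paddr, p_memsz))
    return segments


def covers(segments, paddr):
    """True if paddr falls inside any PT_LOAD physical range."""
    return any(start <= paddr < start + size for start, size in segments)
```

To use it, read the beginning of a copied vmcore file in binary mode, pass the bytes to load_segments(), and then call covers() with a physical address from the failing reads, for example 0x8fbdd6880 from the log above.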
>>>
>>>> Thanks!
>>>> Björn