[BUG] vmcore-dmesg can't read dmesg log from /proc/vmcore if log_buf is reallocated due to large number of CPUs
bhsharma at redhat.com
Wed Oct 24 14:30:08 PDT 2018
On Wed, Oct 24, 2018 at 6:23 PM Lomovtsev, Vadim
<Vadim.Lomovtsev at cavium.com> wrote:
> Hi all,
> The following issue has been found with the vmcore-dmesg app in the latest release (94159bc3c264fa26395e56302072276a139d18af, 2.0.18-rc1) of kexec-tools on the CentOS 7.5 distro:
> On systems with a large number of CPUs (e.g. Cavium ThunderX2 has 224), log_buf gets reallocated by memblock_virt_alloc() in the setup_log_buf routine (https://elixir.bootlin.com/linux/v4.16.18/source/kernel/printk/printk.c#L1108).
> Then, while dumping the vmcore, vmcore-dmesg can't find the dmesg log in the /proc/vmcore file and exits with the following message:
> Failed to read log text of size 0 bytes: Bad address
> However, it (the vmcore-dmesg app) properly reads the log_buf symbol, its address and eventually its value from /proc/vmcore, but then fails to find the dmesg data.
> At the same time, makedumpfile is able to find and extract the dmesg buffer from /proc/vmcore.
> This makedumpfile comes with the kexec-tools-2.0.15-13.el7_5.2.aarch64 package.
> The issue is not reproduced on systems with a small number of CPUs, where log_buf is not reallocated to a memblock section.
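As a rough illustration of why only systems with many CPUs take the
reallocation path described above, the sizing check in printk.c can be
sketched as below. This is a standalone approximation, not the kernel
code itself: the two shift values are assumed defaults for
CONFIG_LOG_BUF_SHIFT and CONFIG_LOG_CPU_MAX_BUF_SHIFT (a distro kernel
may use different ones), and the helper names are hypothetical.

```c
/* Assumed defaults; the real values come from the kernel config. */
#define LOG_BUF_SHIFT          17  /* CONFIG_LOG_BUF_SHIFT (assumption) */
#define LOG_CPU_MAX_BUF_SHIFT  12  /* CONFIG_LOG_CPU_MAX_BUF_SHIFT (assumption) */

#define STATIC_LOG_BUF_LEN   (1UL << LOG_BUF_SHIFT)
#define LOG_CPU_MAX_BUF_LEN  (1UL << LOG_CPU_MAX_BUF_SHIFT)

/* Hypothetical helper: smallest power of two that is >= n. */
static unsigned long roundup_pow2(unsigned long n)
{
    unsigned long p = 1;

    while (p < n)
        p <<= 1;
    return p;
}

/*
 * Hypothetical helper: length of the reallocated log buffer for the
 * given number of possible CPUs, or 0 if the static buffer is kept.
 */
static unsigned long realloc_log_buf_len(unsigned int possible_cpus)
{
    unsigned long cpu_extra;

    if (possible_cpus <= 1)
        return 0;

    /* Extra room requested for every CPU beyond the first. */
    cpu_extra = (possible_cpus - 1) * LOG_CPU_MAX_BUF_LEN;

    /* Small systems: the per-CPU extra fits, keep the static buffer. */
    if (cpu_extra <= STATIC_LOG_BUF_LEN / 2)
        return 0;

    /* Large systems: a new, bigger buffer is allocated from memblock. */
    return roundup_pow2(cpu_extra + STATIC_LOG_BUF_LEN);
}
```

With these assumed defaults, 224 possible CPUs ask for 223 * 4 KiB of
extra space, which exceeds half of the 128 KiB static buffer, so a
1 MiB buffer would be allocated. An 8-CPU system stays on the static
buffer, which matches the observation that small systems do not
reproduce the issue.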
It seems you are hitting a known issue that we also saw on Qualcomm
Amberwing platforms.
I sent a patch series titled "kexec-tools/arm64: Add support to
read PHYS_OFFSET from vmcoreinfo inside '/proc/kcore'" to this list
just a few minutes ago.
I have Cc'ed you on the patchset as I think it might fix the issue for
you. Kindly try the patchset on your platform (Cavium?) and let me
know whether it fixes the issue.