kdump: need help with kexec -p
AKASHI Takahiro
takahiro.akashi at linaro.org
Fri Oct 13 01:36:54 PDT 2017
On Thu, Oct 12, 2017 at 12:40:35PM +0100, James Morse wrote:
> Hi Prabhakar,
>
> (+CC: Akashi Takahiro, who wrote the arm64 kdump support)
Thanks.
> On 11/10/17 10:11, Prabhakar Kushwaha wrote:
> > We are facing some issues while using kexec -p on ARM64 NXP platforms.
> >
> > 1) After calling kexec -p, if a "panic" is triggered immediately, the crash
> > kernel does not boot. If we run a few commands and wait for at least 20-30
> > secs before triggering the panic, the crash kernel boots.
>
> What kernel version do you see this on?
Now I know, from his private e-mail, that this only happens
on lsk (Linaro Stable Kernel) 4.4, to which I also backported my dump :)
So, first, I would like to determine whether this issue is really
lsk-specific or not.
Thanks,
-Takahiro AKASHI
> Can you log the kernel output in each
> case? (Do you get a 'bye' message even when the new kernel doesn't boot?)
>
> Does 'kexec -p' report success in both cases? ($? == 0)
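Besides the exit status, the kernel exposes whether a crash kernel is
currently loaded in /sys/kernel/kexec_crash_loaded. A minimal sketch of
such a check follows; the helper name crash_kernel_loaded is only an
illustration, and the optional path argument exists purely so the helper
can be tried against an ordinary file.

```shell
#!/bin/sh
# Right after running 'kexec -p ...', "$?" holds its exit status (0 == success).
# Independently of that, the kernel reports whether a crash kernel is loaded.

# Return 0 if the given file (default: the kexec_crash_loaded sysfs entry)
# is readable and contains "1". The helper name is hypothetical.
crash_kernel_loaded() {
    f="${1:-/sys/kernel/kexec_crash_loaded}"
    [ -r "$f" ] && [ "$(cat "$f")" = "1" ]
}

if crash_kernel_loaded; then
    echo "crash kernel is loaded"
else
    echo "crash kernel is NOT loaded"
fi
```

Running this in both the "panic immediately" and "wait 20-30 secs" cases
would show whether the load itself differs between them.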
>
>
> kdump can take many seconds in purgatory, it checksums the kdump image to check
> it didn't get corrupted between 'kexec -p' and crash time, but it doesn't sound
> like this is what you're seeing.
>
>
> > 2) We do not see the issue ("1") when we do umount -a before triggering the
> > panic after kexec -p.
>
> What filesystems (ext4, nfs etc) do you have mounted, and which ones does
> 'umount -a' get rid of?
> Where are these filesystems stored?
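One way to collect this is to dump the mount table before and after
'umount -a' and compare. A minimal sketch, assuming the usual
/proc/mounts layout (device, mount point, filesystem type, ...); the
helper name list_mounts is hypothetical.

```shell
#!/bin/sh
# Print "device mountpoint fstype" for every entry in a mount table.
# Defaults to /proc/mounts; another file can be passed for comparison.
list_mounts() {
    awk '{ print $1, $2, $3 }' "${1:-/proc/mounts}"
}

# Example:
#   list_mounts > before.txt
#   umount -a
#   list_mounts > after.txt
#   diff before.txt after.txt   # lines only in before.txt were unmounted
```

The device column also answers where each filesystem is stored (a /dev
node, an NFS export, or a pseudo filesystem such as proc or tmpfs).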
>
> How many CPUs does your platform have?
>
> (...does crashing on a different CPU change the behaviour?)
> > taskset -c 1 bash -c "echo c > /proc/sysrq-trigger"
>
>
> > The issue does not seem to pertain to the NXP software (because this
> > has been observed on a very simple kernel, where most of the
> > controllers have been removed from the device tree).
>
> > We also found some info related to this on the internet, where it is mentioned
> > that booting the next kernel without un-mounting the mounted filesystems is
> > not recommended. (this is in the context of kexec -e though)
> > https://www.linux.com/news/reboot-racecar-kexec.
>
> This is because the filesystem is marked as mounted on-disk, and there may be
> vital data you've written that hasn't made it to the disk yet.
>
> For 'kexec -e' I think it tries to shut down and reboot, then jumps to the new
> kernel instead of calling the firmware. This means all filesystems should be
> sync()d, umounted or at least remounted read-only.
>
> For kdump, we've already crashed, so you've already lost data. It's a best
> effort: can we get to a point where you can debug the original crash?
>
>
> Thanks,
>
> James
>