kdump: need help with kexec -p

AKASHI Takahiro takahiro.akashi at linaro.org
Fri Oct 13 01:36:54 PDT 2017


On Thu, Oct 12, 2017 at 12:40:35PM +0100, James Morse wrote:
> Hi Prabhakar,
> 
> (+CC: Akashi Takahiro, who wrote the arm64 kdump support)

Thanks.

> On 11/10/17 10:11, Prabhakar Kushwaha wrote:
> > We are facing some issues while using kexec -p on ARM64 NXP platforms.
> >
> > 1) If a panic is triggered immediately after calling kexec -p, the crash
> > kernel does not boot. If we run a few commands and wait at least 20-30
> > seconds before triggering the panic, the crash kernel boots.
> 
> What kernel version do you see this on?

I now know, from his private e-mail, that this only happens
on lsk (Linaro Stable Kernel) 4.4, to which I also backported my kdump patches :)

So, first, I would like to determine whether this issue is really
lsk-specific or not.

Thanks,
-Takahiro AKASHI

> Can you log the kernel output in each
> case? (Do you get a 'bye' message even when the new kernel doesn't boot?)
> 
> Does 'kexec -p' report success in both cases? ($? == 0)
> 
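
Besides the exit status, the kernel reports whether a crash image is currently
loaded in /sys/kernel/kexec_crash_loaded (it reads "1" when one is). A minimal
sketch of such a check follows; the helper name crash_kernel_loaded is made up
for illustration, and the path parameter exists only so the function can be
tried against an ordinary file.

```python
from pathlib import Path


def crash_kernel_loaded(path="/sys/kernel/kexec_crash_loaded"):
    """Return True if the sysfs file at `path` reports a loaded crash kernel.

    Hypothetical helper: returns False when the file cannot be read
    (for example, on a kernel built without kexec support).
    """
    try:
        return Path(path).read_text().strip() == "1"
    except OSError:
        return False


if __name__ == "__main__":
    # Run this right after 'kexec -p ...' and again just before the panic.
    print("crash kernel loaded:", crash_kernel_loaded())
```

If this prints False right after a 'kexec -p' that returned 0, the problem is
on the load side and not in the crash path.
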
> 
> kdump can take many seconds in purgatory: it checksums the kdump image to
> check that it wasn't corrupted between 'kexec -p' and crash time. But it
> doesn't sound like this is what you're seeing.
> 
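
To illustrate the kind of check purgatory performs: a digest over the loaded
segments is recorded at 'kexec -p' time and recomputed at crash time, and the
crash kernel is only entered if the two match. The sketch below is not the
purgatory code (which is C); it assumes SHA-256, and the function names and
the segment contents are invented for the example.

```python
import hashlib


def digest_regions(regions):
    """Return the SHA-256 hex digest computed over a sequence of byte regions."""
    h = hashlib.sha256()
    for region in regions:
        h.update(region)
    return h.hexdigest()


def verify_regions(regions, expected_digest):
    """Recompute the digest and compare it with the one recorded at load time."""
    return digest_regions(regions) == expected_digest


if __name__ == "__main__":
    # Stand-ins for the loaded segments (contents are made up).
    segments = [bytearray(b"kernel"), bytearray(b"initrd"), bytearray(b"dtb")]

    recorded = digest_regions(segments)        # at 'kexec -p' time
    print(verify_regions(segments, recorded))  # image intact

    segments[1][0] ^= 0xFF                     # simulate corruption before the crash
    print(verify_regions(segments, recorded))  # mismatch is detected
```

Hashing a large image this way takes time proportional to its size, which is
why this step can account for a delay of several seconds before the crash
kernel starts.
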
> 
> > 2) We do not see issue (1) when we run umount -a after kexec -p and before
> > triggering the panic.
> 
> What filesystems (ext4, nfs etc) do you have mounted, and which ones does
> 'umount -a' get rid of?
> Where are these filesystems stored?
> 
> How many CPUs does your platform have?
> 
> (...does crashing on a different CPU change the behaviour?)
> > taskset -c 1 bash -c "echo c > /proc/sysrq-trigger"
> 
> 
> > The issue does not seem to be specific to the NXP software, because we have
> > also observed it on a very simple kernel where most of the controllers have
> > been removed from the device tree.
> 
> > We also found some information on the internet saying that booting the next
> > kernel without unmounting the mounted filesystems is not recommended (this
> > is in the context of kexec -e, though):
> > https://www.linux.com/news/reboot-racecar-kexec
> 
> This is because the filesystem is marked as mounted on-disk, and there may be
> vital data that you've written but that hasn't made it to the disk yet.
> 
> For 'kexec -e' I think it tries to shut down and reboot, then jumps to the new
> kernel instead of calling the firmware. This means all filesystems should be
> sync()d, unmounted or at least remounted read-only.
> 
> For kdump, we've already crashed, so you've already lost data. It's a
> best-effort attempt to get to a point where you can debug the original crash.
> 
> 
> Thanks,
> 
> James
> 



More information about the linux-arm-kernel mailing list