Questions about kexec-tools (resend to list)

Pratyush Anand panand at redhat.com
Tue Mar 7 06:53:09 PST 2017


Hi Philip,

On Sunday 05 March 2017 04:56 AM, Philip Prindeville wrote:

[...]

>
> In the case of having a single system kernel binary, then you’d have to install this kernel and its modules, and add this kernel to the boot loader configuration files, wouldn’t you?  What do my grub arguments look like?

Not necessarily all the modules. The kdump kernel needs only a minimal 
set of modules. You can build your initramfs with just the modules 
needed to boot and copy the vmcore.

>
> Do I always load my system kernel with “crashkernel=64M@16M” per the “CONFIG_PHYSICAL_START” and here:

You need to pass "crashkernel=" to the first kernel. Giving only the 
size (crashkernel=64M) should also work; the kernel should then find an 
appropriate start address for the crash kernel region.
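
A minimal shell sketch for checking this on the running (first) kernel. 
The helper name has_crashkernel is made up for illustration and is not 
part of kexec-tools; /sys/kernel/kexec_crash_size is the sysfs file that 
reports the size of the reserved region in bytes, and 0 there means 
nothing was reserved.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the given kernel command line
# contains a crashkernel= option.
has_crashkernel() {
    # $1: a kernel command line string, e.g. the contents of /proc/cmdline
    case " $1 " in
        *" crashkernel="*) return 0 ;;
        *) return 1 ;;
    esac
}
```

On a live system this could be used as 
has_crashkernel "$(cat /proc/cmdline)" && cat /sys/kernel/kexec_crash_size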

>
>
>> 2) Boot the system kernel with the boot parameter "crashkernel=Y@X",
>>  where Y specifies how much memory to reserve for the dump-capture kernel
>>  and X specifies the beginning of this reserved memory. For example,
>>  "crashkernel=64M@16M" tells the system kernel to reserve 64 MB of memory
>>  starting at physical address 0x01000000 (16MB) for the dump-capture kernel.
>
>
>
> Okay, we have a 2.6MB /vmlinuz in our /boot partition, so it’s relocatable and this part applies:
>
>
>> If you are using a compressed bzImage/vmlinuz, then use following command
>> to load dump-capture kernel.
>>
>>  kexec -p <dump-capture-kernel-bzImage> \
>>  --initrd=<initrd-for-dump-capture-kernel> \
>>  --append="root=<root-dev> <arch-specific-options>"
>
>
>
> Not sure I understand this part.  So if we have a relocatable kernel with crashdump built-in to our system kernel, do we need to load two kernels, just with different <arch-specific-options> and everything else being the same?

You are in the primary kernel and you need to load the crash kernel.

`kexec -p /boot/vmlinuz --initrd=/boot/kdump-initrd --reuse-cmdline 
--append="irqpoll maxcpus=1 reset_devices"`  should work.

You need to prepare a kdump-initrd, or you can use the current initrd, 
but that will load all the modules of the 1st kernel, and 64M might not 
be enough memory then.

>
> Would the <arch-specific-options> be:
>
> crashkernel=64M@16M 1 irqpoll maxcpus=1 reset_devices

"crashkernel=" *must* *not* be passed to crash kernel. It is only for 
the primary kernel.
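
If the crash kernel's command line is built by hand from /proc/cmdline 
rather than with --reuse-cmdline, the crashkernel= option has to be 
filtered out first. A rough sketch, assuming simple space-separated 
options (quoted values containing spaces are not handled); 
strip_crashkernel is a hypothetical helper name, not a kexec-tools 
function.

```shell
#!/bin/sh
# Hypothetical helper: print the given kernel command line with every
# crashkernel= option removed. Runs in a subshell so that "set -f"
# (no glob expansion while word-splitting) does not leak to the caller.
strip_crashkernel() (
    set -f
    out=""
    for arg in $1; do
        case "$arg" in
            crashkernel=*) ;;                 # primary kernel only: drop it
            *) out="${out:+$out }$arg" ;;     # keep everything else
        esac
    done
    printf '%s\n' "$out"
)
```

The result could then be passed along as, for example, 
--append="$(strip_crashkernel "$(cat /proc/cmdline)") irqpoll maxcpus=1 reset_devices"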

>
> in that case?
>
> On a normally running system, using an overlay root, our cmdline looks like:
>
> BOOT_IMAGE=/boot/vmlinuz block2mtd.block2mtd=/dev/sda2,65536,rootfs,5 root=/dev/mtdblock0 rootfstype=squashfs rootwait console=tty0 console=ttyS0,115200n8r noinitrd

So, it should also have crashkernel=64M.

>
>
> so I guess we’d just mash on those extra arguments.  On a running system, our mount points are:
>
> /dev/root on /rom type squashfs (ro,relatime)
> proc on /proc type proc (rw,nosuid,nodev,noexec,noatime)
> sysfs on /sys type sysfs (rw,nosuid,nodev,noexec,noatime)
> tmpfs on /tmp type tmpfs (rw,nosuid,nodev,noatime)
> tmpfs on /tmp/root type tmpfs (rw,noatime,mode=755)
> tmpfs on /dev type tmpfs (rw,nosuid,relatime,size=512k,mode=755)
> devpts on /dev/pts type devpts (rw,nosuid,noexec,relatime,mode=600)
> debugfs on /sys/kernel/debug type debugfs (rw,noatime)
> /dev/mtdblock1 on /overlay type jffs2 (rw,noatime)
> overlayfs:/overlay on / type overlay (rw,noatime,lowerdir=/,upperdir=/overlay/upper,workdir=/overlay/work)
>
>
> but it doesn’t sound like any of that would change (except perhaps mounting a USB thumb-drive if we wanted to copy our crashdump to that device instead).
>
> So if I’ve understood, when the first loaded kernel (the system kernel) crashes, kexec will then try the next kernel it sees…  which will be something like:
>
> kexec -p /boot/vmlinuz \
> 	--append="$(cat /proc/cmdline) irqpoll maxcpus=1 reset_devices 1"
>
> (we don’t use a initrd as you can see above) and that’s described here:

OK, so you can omit the --initrd argument to kexec.

>
>
>> Kernel Panic
>> ============
>>
>> After successfully loading the dump-capture kernel as previously
>> described, the system will reboot into the dump-capture kernel if a
>> system crash is triggered. [snip]
>
>
>
> assuming the system isn’t so badly hosed that a WDT expires causing a BIOS reset, etc.
>
> Do both kernels use the same “crashdump=“ value, or do they need different base addresses?

Again, only the 1st kernel needs "crashkernel=".

>
> And assuming that you’re using the same kernel, etc. how does the init.d scripting on the crashdump (2nd instance of the kernel) know that it’s not the nominal kernel?  Do we use /sys/kernel/kexec_loaded for this purpose?  Or do we just look for the existence of /proc/vmcore?

Yes, /proc/vmcore exists in the 2nd kernel but not in the 1st kernel.
/sys/kernel/kexec_crash_loaded should read 1 in the 1st kernel (once the 
crash kernel has been loaded) and 0 in the crash kernel.
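
Note that kexec_crash_loaded also reads 0 in the 1st kernel before 
anything has been loaded, so a 0 on its own does not identify the crash 
kernel; the presence of /proc/vmcore is the less ambiguous test. A small 
sketch combining both checks follows. kernel_role and the VMCORE and 
CRASH_LOADED override variables are hypothetical names, added only so 
the function can be tried out without actually crashing a machine.

```shell
#!/bin/sh
# Hypothetical helper: print which role the running kernel has.
# VMCORE and CRASH_LOADED default to the standard paths and can be
# overridden for testing.
kernel_role() {
    vmcore="${VMCORE:-/proc/vmcore}"
    loaded="${CRASH_LOADED:-/sys/kernel/kexec_crash_loaded}"
    if [ -e "$vmcore" ]; then
        echo crash              # 2nd (dump-capture) kernel
    elif [ -r "$loaded" ] && [ "$(cat "$loaded")" = 1 ]; then
        echo primary-armed      # 1st kernel, crash kernel already loaded
    else
        echo primary-unarmed    # 1st kernel, nothing loaded yet
    fi
}
```

An init script could then load the crash kernel only in the 
primary-unarmed case and collect the dump only in the crash case.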

>
> And then have something in my init.d scripts like:
>
> kexec_loaded=$(< /sys/kernel/kexec_loaded)

/sys/kernel/kexec_crash_loaded

>
> if [ "$kexec_loaded" = 0 ]; then
>   kexec -p /boot/vmlinuz \
> 	--append="$(cat /proc/cmdline) irqpoll maxcpus=1 reset_devices 1"
> else
>   echo "*** HANDLING CRASH DUMP COLLECTION"
>   mkdir -p /mnt/crashdrive
>   mount LABEL=crashdrive /mnt/crashdrive
>   # might do something clever here with "df --output=avail -m /mnt/crashdrive" to make
>   # sure I have enough space for the copy, perhaps deleting older dumps until I do…
>   cp /proc/vmcore /mnt/crashdrive
>   sync
>   umount /mnt/crashdrive
>   echo "*** NOW REBOOTING"
>   reboot -f
> fi
>

The above should work.
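
For the free-space check mentioned in the comment in your script, one 
possible sketch is below. avail_kb and has_room are hypothetical helper 
names. It assumes a stat that supports "-c %s" (GNU coreutils or 
BusyBox) and that the file reports its size through stat, which should 
be the case for /proc/vmcore but is worth verifying on your kernel.

```shell
#!/bin/sh
# Hypothetical helper: print the space available (in KiB) on the
# filesystem that holds directory $1.
avail_kb() {
    # With -P, df prints one line per filesystem; field 4 is "Available".
    df -Pk "$1" | awk 'NR == 2 { print $4 }'
}

# Hypothetical helper: succeed if file $1 fits into directory $2.
has_room() {
    size_bytes=$(stat -c %s "$1") || return 1
    need_kb=$(( (size_bytes + 1023) / 1024 ))   # round up to whole KiB
    [ "$(avail_kb "$2")" -ge "$need_kb" ]
}
```

In the script this could guard the copy, for example 
has_room /proc/vmcore /mnt/crashdrive && cp /proc/vmcore /mnt/crashdrive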

There are many ways to do this. You can have a look at the Fedora kexec-tools code.
http://pkgs.fedoraproject.org/cgit/rpms/kexec-tools.git/


> Do I need to reboot in a particular way to avoid looping?  The “Kernel Panic” section seems to state that normal reboots won’t be affected.

When you execute reboot, the system will reboot into the 1st kernel 
through GRUB (the boot loader).

>
> I appreciate the documentation you’ve written, but it’s a little unclear (to me at least) how to handle the degenerate case of using the same kernel as the system kernel and the crashdump kernel…
>
> I want to make sure that I don’t inadvertently set it up to do looping infinitely nested kernels, etc.
>
> I’m probably overthinking this, but… we’re having crashes in the field and the customers are a little riled up right now so I don’t want to spend a lot of time saying “here try this image”.  They want their smoking gun and they want it soon.
>


~Pratyush


