[PATCHv10 0/9] Xen: extend kexec hypercall for use with pv-ops kernels

Daniel Kiper daniel.kiper at oracle.com
Thu Nov 7 16:16:51 EST 2013


On Wed, Nov 06, 2013 at 02:49:37PM +0000, David Vrabel wrote:
> The series (for Xen 4.4) improves the kexec hypercall by making Xen
> responsible for loading and relocating the image.  This allows kexec
> to be usable by pv-ops kernels and should allow kexec to be usable
> from a HVM or PVH privileged domain.
>
> I have now tested this with a Linux kernel image using the VGA console
> which was what was causing problems in v9 (this turned out to be a
> kexec-tools bug).
>
> The required patch series for kexec-tools will be posted shortly and
> are available from the xen-v7 branch of:

In general it works. However, quite often I am not able to execute the panic
kernel. The machine hangs with the following message:

(XEN) Domain 0 crashed: Executing crash image

gdb shows:

(gdb) bt
#0  0xffff82d0801a0092 in do_nmi_crash (regs=<optimized out>) at crash.c:113
#1  0xffff82d0802281d9 in nmi_crash () at entry.S:666
#2  0x0000000000000000 in ?? ()
(gdb)

The second bt line especially scares me... ;-)))

I have not been able to identify why the NMI was triggered because
the stack is completely cleared. I tried to record execution in gdb
but it stops with the following message:

cpumask_clear_cpu (dstp=0xffff82d0802f7f78 <call_data+24>, cpu=0)
    at /srv/dev/xen/xen_20130413_20131107.kexec/xen/include/xen/cpumask.h:108
108             clear_bit(cpumask_check(cpu), dstp->bits);
Process record: failed to record execution log.

Do you know how to find out why the NMI was triggered?

I can almost always reproduce this issue with the following steps:
  - boot Xen,
  - load panic kernel,
  - echo c > /proc/sysrq-trigger,
  - reboot from command line,
  - boot Xen,
  - load panic kernel,
  - echo c > /proc/sysrq-trigger.
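
The part of the steps above that runs after each Xen boot can be sketched
as a small shell script. This is only a sketch under assumptions: the
kernel and initrd paths and the command line are placeholders rather than
the ones used here, and it assumes the panic kernel is loaded with
kexec -p. The DRY_RUN switch and the helper names are made up for
illustration. By default it only prints what it would run.

```shell
#!/bin/sh
# One boot cycle of the reproduction: load the panic kernel, then crash dom0.
# Run it once after each Xen boot; the reboot between cycles is done by hand.
# KERNEL, INITRD and APPEND are placeholders (assumptions), adjust as needed.
KERNEL="${KERNEL:-/boot/vmlinuz-crash}"
INITRD="${INITRD:-/boot/initrd-crash.img}"
APPEND="${APPEND:-console=tty0 irqpoll maxcpus=1}"
DRY_RUN="${DRY_RUN:-1}"   # 1 = only print the commands, 0 = really run them

# Print the command in dry-run mode, execute it otherwise.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

# Load the crash (panic) kernel.
load_panic_kernel() {
    run kexec -p "$KERNEL" --initrd="$INITRD" --append="$APPEND"
}

# Same effect as typing: echo c > /proc/sysrq-trigger
trigger_crash() {
    run sh -c 'echo c > /proc/sysrq-trigger'
}

load_panic_kernel
trigger_crash
```

Running it with DRY_RUN=0 as root crashes the machine on purpose, so it
should only be used on a test box.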

Additionally, my build fails because the compiler detects an unused result
variable in xen/common/kimage.c:kimage_crash_alloc().

Daniel


