[PATCH], issue EOI to APIC prior to calling crash_kexec in die_nmi path
Neil Horman
nhorman at tuxdriver.com
Fri Feb 8 11:14:22 EST 2008
On Thu, Feb 07, 2008 at 01:24:04PM +0100, Ingo Molnar wrote:
>
> * Neil Horman <nhorman at tuxdriver.com> wrote:
>
> > Ingo noted a few posts down that nmi_exit doesn't actually write to the
> > APIC EOI register, so yeah, I agree, it's bogus (and I apologize, I
> > should have checked that more carefully). Nevertheless, this patch
> > consistently allowed a hanging machine to boot through an NMI lockup.
> > So I'm forced to wonder what's going on then that this patch helps
> > with. Perhaps it's just a very fragile timing issue; I'll need to
> > look more closely.
>
> try a dummy iret, something like:
>
> asm volatile ("pushf; push $1f; iret; 1: \n");
>
> to get the CPU out of its 'nested NMI' state. (totally untested)
>
> the idea is to push down an iret frame to the kernel stack that will
> just jump to the next instruction and get it out of the NMI nesting.
> Note: interrupts will/must still be disabled, despite the iret. (the
> ordering of the pushes might be wrong, we might need more than that for
> a valid iret, etc. etc.)
>
> Ingo
I just tried this experiment and it worked. Executing a dummy iret
instruction let the kdump kernel boot successfully.
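For reference, here is a minimal sketch of what a dummy iret with a complete
frame might look like on x86_64. It is not necessarily the exact sequence
tested above. The helper name dummy_iret() is hypothetical, and the red-zone
adjustment is an assumption that only matters for userspace builds (the
kernel is built without a red zone). In 64-bit mode iretq pops RIP, CS,
RFLAGS, RSP and SS, so all five are pushed; a 32-bit same-privilege frame
would need only EFLAGS, CS and EIP (pushfl; pushl %cs; pushl $1f; iret).

```c
/*
 * Hypothetical helper: build an iret frame that "returns" to the
 * instruction right after the iretq, on the same stack and with the
 * same flags. Because pushfq saves the current RFLAGS, the interrupt
 * flag is restored unchanged (interrupts stay disabled if they were).
 */
static inline void dummy_iret(void)
{
#if defined(__x86_64__)
	unsigned int seg;	/* scratch for segment selectors */
	unsigned long sp;	/* scratch for stack pointer / return address */

	__asm__ __volatile__(
		"movq %%rsp, %1\n\t"		/* RSP value to resume with */
		"leaq -128(%%rsp), %%rsp\n\t"	/* step over the red zone (assumption: userspace build) */
		"mov %%ss, %0\n\t"
		"pushq %q0\n\t"			/* SS */
		"pushq %1\n\t"			/* RSP */
		"pushfq\n\t"			/* RFLAGS */
		"mov %%cs, %0\n\t"
		"pushq %q0\n\t"			/* CS */
		"leaq 1f(%%rip), %1\n\t"
		"pushq %1\n\t"			/* RIP: the label below */
		"iretq\n\t"
		"1:"
		: "=&r" (seg), "=&r" (sp)
		:
		: "cc", "memory");
#endif
	/* On other architectures this sketch is a no-op. */
}
```

In the NMI path the call would presumably sit just before the kexec
hand-off, but where exactly it belongs is the open question below.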
Thoughts on how we should handle this from here?
Regards
Neil
--
/****************************************************
* Neil Horman <nhorman at tuxdriver.com>
* Software Engineer, Red Hat
****************************************************/