[PATCH 10/10] Use safe_apic_wait_icr_idle in __send_IPI_dest_field - x86_64
Fernando Luis Vázquez Cao
fernando at oss.ntt.co.jp
Wed Apr 25 08:52:36 EDT 2007
On Wed, 2007-04-25 at 14:33 +0200, Andi Kleen wrote:
> On Wednesday 25 April 2007 13:51:12 Fernando Luis Vázquez Cao wrote:
> > Use safe_apic_wait_icr_idle to check ICR idle bit if the vector is
> > NMI_VECTOR to avoid potential hangups in the event of crash when kdump
> > tries to stop the other CPUs.
> But what happens then when this fails? Won't this give another hang?
> Have you tested this?
In kdump the crashing CPU (i.e. the CPU that called crash_kexec) is the
one in charge of rebooting into and executing the dump capture kernel.
But before doing this it attempts to stop the other CPUs sending a IPI
using NMI_VECTOR as the vector. The problem is that sometimes delivery
seems to fail and the crashing CPU gets stuck waiting for the ICR status
bit to be cleared, which will never happen.
With this patch, when safe_apic_wait_icr_idle times out the CPU will
continue executing and try to hand over control to the dump capture
kernel as usual. After applying this patch I have not seen hangs in the
reboot path to second kernel showing the symptoms mentioned before, but
perhaps I am just being lucky and there is something else going on.
More information about the kexec