Timer interrupt lost on some x86_64 systems
vgoyal at in.ibm.com
Sun Nov 11 23:49:03 EST 2007
On Wed, Nov 07, 2007 at 09:00:06AM -0500, Neil Horman wrote:
> Hey all-
> I've been getting reports of some x86_64 systems that, on kdump kernel
> boot get stuck in calibrate_delay(), in both RHEL kernels and upstream kernels.
> The current thinking is that the lapic timer interrupt is no longer getting
> delivered, likely because we handle a crash condition on a cpu that isn't the
> boot cpu. One known offender is this motherboard:
> My current thought is that the TIMER_LVT entry is masked on all but the boot cpu
> on this system (which is strange, as I was under the impression that the timer
> interrupt was supposed to be enabled on all CPUs nominally).
I also thought that LAPIC timer interrupts are enabled on all cpus.
> At any rate, I was
> going to try to read/write the TIMER_LVT on the crashing processor before we
> jump to purgatory, or in purgatory itself, to see if that fixes the problem, but
I think calibrate_delay() depends on an external timer interrupt arriving,
not on the local APIC timer interrupt. Generally that is the 8254 timer chip.
Nowadays motherboards seem to have an HPET, and I know somebody has reported
problems where HPET interrupts do not arrive in the second kernel and the
system hangs there. I suspect the same might be the issue here.
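
To illustrate why a missing tick would look like a hang in calibrate_delay():
the calibration has to wait for the tick counter (jiffies) to change before it
can time anything, and that counter only moves when the timer interrupt is
handled. Below is a rough user-space sketch of that dependency, not the kernel
code. fake_jiffies, struct tick_source, poll_tick_source() and
wait_for_tick_edge() are made-up names for illustration, and the max_spins
bound exists only so the demo terminates; as far as I know the real loop has
no such bound.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the kernel's jiffies counter. */
static volatile unsigned long fake_jiffies;

/*
 * Hypothetical tick source. When "delivering" is true, every 1000th poll
 * behaves like a handled timer interrupt and bumps the counter. When it is
 * false, nothing ever advances the counter (the suspected kdump case).
 */
struct tick_source {
    bool delivering;
    unsigned long polls;
};

static void poll_tick_source(struct tick_source *ts)
{
    ts->polls++;
    if (ts->delivering && ts->polls % 1000 == 0)
        fake_jiffies++;
}

/*
 * Spin until the tick counter changes, the way a delay calibration must
 * before it can start measuring. Returns true if a tick arrived and false
 * if max_spins polls passed without one.
 */
static bool wait_for_tick_edge(struct tick_source *ts, unsigned long max_spins)
{
    assert(max_spins > 0);

    unsigned long start = fake_jiffies;

    for (unsigned long i = 0; i < max_spins; i++) {
        poll_tick_source(ts);
        if (fake_jiffies != start)
            return true;   /* tick seen, calibration could proceed */
    }
    return false;          /* no tick: an unbounded loop would spin forever */
}
```

With a tick_source whose delivering field is true, wait_for_tick_edge()
returns true after about a thousand polls. With delivering set to false it
exhausts max_spins and returns false, which corresponds to the second kernel
sitting in calibrate_delay() with no timer interrupt coming in.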