Timer interrupt lost on some x86_64 systems
Eric W. Biederman
ebiederm at xmission.com
Thu Nov 22 15:04:38 EST 2007
Neil Horman <nhorman at redhat.com> writes:
> On Wed, Nov 14, 2007 at 12:09:39PM +0530, Vivek Goyal wrote:
>> On Tue, Nov 13, 2007 at 09:33:30AM -0500, Neil Horman wrote:
>> > > In the past I have found issues with interrupt routing on IOAPIC and
>> > > interrupt lockup on LAPIC. But these issues are already solved. I would
>> > > also think of printing LAPIC and IOAPIC entries to see how timer interrupt
>> > > routing changes from first kernel to second.
>> > >
>> > I recently read the ioapic section in the opteron processor guide and noted
>> > the ioapic routing field in the config registers, so I'll be looking at that.
>> > Also note that in the failing case on the systems in question the boot cpu is
>> > _not_ the cpu that boots the kdump kernel, and its APIC ID is 1 not 0, IIRC.
>> Failing on non-boot cpu should not be an issue. I had fixed an issue in the
>> past where non-boot cpu was not receiving the timer interrupts because of
>> IOAPIC settings where timer interrupts were always routed to boot cpu (cpu0).
>> Now it has been modified, and while going down we determine which cpu we
>> are crashing on and set up the IOAPIC entry accordingly. See disable_IO_APIC().
> I see the call to it in machine_crash_shutdown, but for whatever reason, it
> doesn't seem to be having the desired effect in this case....hmmmmm...
I don't know if anything has happened on this. However, a lot of this looks
like it comes back to the current todo list item of getting the kernel to
come up initially in ioapic mode.
That simultaneously removes the need for machine_kexec to reprogram interrupts
into virtual wire mode, and it should ultimately simplify irq initialization
and make it more robust. At the very least it would reduce the amount of
magic in early irq processing.
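For reference, the kind of reprogramming being discussed (pointing the
virtual wire entry at the cpu we are crashing on, as disable_IO_APIC() is
described as doing above) can be sketched roughly as below. This is only an
illustration and not the kernel's code: the helper names and macros are made
up for this example, and the bit positions assume the usual 64-bit I/O APIC
redirection table entry layout (delivery mode in bits 8-10, mask in bit 16,
destination starting at bit 56, ExtINT delivery mode encoded as 7).

```c
#include <stdint.h>

/* Assumed field positions in a 64-bit I/O APIC redirection table entry. */
#define RTE_DELIVERY_MODE_SHIFT 8            /* delivery mode, bits 8-10 */
#define RTE_MASK_BIT            (1ULL << 16) /* 1 = interrupt masked */
#define RTE_DEST_SHIFT          56           /* destination APIC ID field */
#define RTE_DELIVERY_EXTINT     7ULL         /* ExtINT: 8259 supplies the vector */

/*
 * Hypothetical helper: build an entry that passes 8259 interrupts through
 * as ExtINT to the cpu with the given APIC ID. Every other field is left
 * at zero: vector 0, physical destination mode, active high, edge
 * triggered, unmasked.
 */
static uint64_t build_virtual_wire_entry(uint8_t apic_id)
{
    uint64_t entry = 0;

    entry |= RTE_DELIVERY_EXTINT << RTE_DELIVERY_MODE_SHIFT;
    entry |= (uint64_t)apic_id << RTE_DEST_SHIFT;
    return entry;
}

/* Hypothetical helper: read the destination APIC ID back out of an entry. */
static uint8_t rte_dest(uint64_t entry)
{
    return (uint8_t)(entry >> RTE_DEST_SHIFT);
}
```

With this sketch, a crash on the cpu with APIC ID 1 would produce an entry
whose destination field is 1 rather than 0, which is the case that matters
on the failing systems. If the kernel came up in ioapic mode from the start,
this step would not be needed at all.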