Problem with kexec on i386, linux-3.5
Eric W. Biederman
ebiederm at xmission.com
Thu Aug 16 15:22:46 EDT 2012
Christian Schaubschläger <christian.schaubschlaeger at gmx.at> writes:
>> That is a tricky issue. Sometimes the slightest things can set
>> something like this off.
>> Somewhere someone changed something in one of the drivers that made it
>> so that the hardware winds up in a state the int 13 disk driver does not
>> like after kexec.
>> If you want to track this down I would recommend a bisect between 3.4
>> and 3.5-rc1 to see which change breaks your setup.
> I bisected that down to this patch:
> commit b566a22c23327f18ce941ffad0ca907e50a53d41
> Author: Khalid Aziz <khalid.aziz at hp.com>
> Date: Fri Apr 27 13:00:33 2012 -0600
> PCI: disable Bus Master on PCI device shutdown
> Disable Bus Master bit on the device in pci_device_shutdown() to ensure PCI
> devices do not continue to DMA data after shutdown. This can cause memory
> corruption in case of a kexec where the current kernel shuts down and
> transfers control to a new kernel while a PCI device continues to DMA to
> memory that does not belong to it any more in the new kernel.
> I have tested this code on two laptops, two workstations and a 16-socket
> server. kexec worked correctly on all of them.
> Signed-off-by: Khalid Aziz <khalid.aziz at hp.com>
> Signed-off-by: Bjorn Helgaas <bhelgaas at google.com>
> Without this patch, int13 works fine here! If anyone needs more
> information, just let me know!
Which leads to an interesting conundrum.
kexec appears to be more reliable for booting another kernel with this
patch applied. This patch does kill the entire use case of making BIOS
calls, and I suspect it also does nasty things to alpha bootloaders.
My gut feel is that the trampoline code should reenable bus mastering
on the devices that lie behind int13, but I don't know how practical
that suggestion is in reality.
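
For reference, the bit in question is Bus Master enable, bit 2 of the 16-bit
PCI command register at config-space offset 0x04. Below is a minimal sketch of
what clearing and re-setting it amounts to. It operates on a plain value, not
on real config space. The helper names are hypothetical, and this is neither
the kernel's code nor what a trampoline would actually have to do to reach the
right devices.

```c
#include <stdint.h>

/* Standard PCI config-space values (same names as in the Linux headers). */
#define PCI_COMMAND        0x04  /* offset of the 16-bit command register */
#define PCI_COMMAND_MASTER 0x4   /* bit 2: Bus Master enable */

/* Hypothetical helper: the effect the shutdown path has on the command word. */
static uint16_t cmd_disable_bus_master(uint16_t cmd)
{
    return (uint16_t)(cmd & ~PCI_COMMAND_MASTER);
}

/* Hypothetical helper: what re-enabling bus mastering would set again. */
static uint16_t cmd_enable_bus_master(uint16_t cmd)
{
    return (uint16_t)(cmd | PCI_COMMAND_MASTER);
}

/* Hypothetical helper: nonzero if the device may currently initiate DMA. */
static int cmd_bus_master_enabled(uint16_t cmd)
{
    return (cmd & PCI_COMMAND_MASTER) != 0;
}
```

The hard part is not the bit itself but knowing which devices sit behind
int13 and getting at their config space from the trampoline.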