3.9.0-rc1: kexec not working: root disk does not show up

Vivek Goyal vgoyal at redhat.com
Wed Mar 13 09:53:51 EDT 2013


On Wed, Mar 13, 2013 at 11:46:29AM +0400, Konstantin Khlebnikov wrote:

[..]
> >OK, some more observations.
> >
> >- The problem seems to be in the shutdown path, because the older 3.8 kernel
> >   can kexec into the newer 3.9-rc1 kernel but not vice versa.
> >
> >I did a git bisect and the following commit seems to be the problem.
> >
> >commit 7897e6022761ace7377f0f784fca059da55f5d71
> >Author: Konstantin Khlebnikov<khlebnikov at openvz.org>
> >Date:   Mon Feb 4 15:55:58 2013 +0400
> >
> >     PCI: Disable Bus Master unconditionally in pci_device_shutdown()
> >
> >     Commit b566a22c23 ("PCI: disable Bus Master on PCI device shutdown")
> >     used pci_disable_device(), but that doesn't disable Bus Mastering
> >     unconditionally; we allow nested enable/disable calls, and only the
> >     last disable call actually does anything.
> >
> >     This uses pci_clear_master() to unconditionally clear the Bus Master
> >     bit.
> >
> >     Matthew Garrett and Alan Cox said (see LKML link below) that clearing
> >     Bus Master for all PCI devices may lead to unpredictable consequences:
> >     some devices ignores this bit and continue DMA, some of them hang after
> >     that or crash the whole system.  But we're already trying to clear Bus
> >     Master in general because of b566a22c23; this merely deals with the
> >     cases where drivers haven't shut down the device correctly.
> >
> >     [bhelgaas: changelog]
> >     Link: https://lkml.org/lkml/2012/6/6/278
> >     Signed-off-by: Konstantin Khlebnikov<khlebnikov at openvz.org>
> >     Signed-off-by: Bjorn Helgaas<bhelgaas at google.com>
> >     Acked-by: Rafael J. Wysocki<rafael.j.wysocki at intel.com>
> >
> >I reverted the above commit and things work again, except that I get the
> >following warning during shutdown.
> >
> >[   54.252516] ------------[ cut here ]------------
> >[   54.257199] WARNING: at drivers/pci/pci.c:1397 pci_disable_device+0x90/0xa0()
> >[   54.264387] Hardware name: HP xw6600 Workstation
> >[   54.269061] Device pci
> >disabling already-disabled device
> >[   54.274341] Modules linked in: floppy
> >[   54.278403] Pid: 5272, comm: kexec Not tainted 3.9.0-rc2+ #207
> >[   54.284289] Call Trace:
> >[   54.286801]  [<ffffffff8133c600>] ? pci_disable_device+0x60/0xa0
> >[   54.292864]  [<ffffffff8103e49f>] warn_slowpath_common+0x7f/0xc0
> >[   54.298926]  [<ffffffff8103e596>] warn_slowpath_fmt+0x46/0x50
> >[   54.304727]  [<ffffffff8133c592>] ? do_pci_disable_device+0x52/0x60
> >[   54.311050]  [<ffffffff8133c630>] pci_disable_device+0x90/0xa0
> >[   54.316938]  [<ffffffff8133e1a4>] pci_device_shutdown+0x44/0x50
> >[   54.322915]  [<ffffffff81462b2d>] device_shutdown+0x1d/0x180
> >[   54.328631]  [<ffffffff81056ba6>] kernel_restart_prepare+0x36/0x50
> >[   54.334866]  [<ffffffff810a16c0>] kernel_kexec+0x50/0x80
> >[   54.340235]  [<ffffffff81056e35>] sys_reboot+0x1f5/0x260
> >[   54.345604]  [<ffffffff811621b9>] ? mntput_no_expire+0x49/0x160
> >[   54.351578]  [<ffffffff811622f6>] ? mntput+0x26/0x40
> >[   54.356601]  [<ffffffff81144539>] ? __fput+0x1a9/0x280
> >[   54.361798]  [<ffffffff8105fae4>] ? task_work_run+0xc4/0xe0
> >[   54.367428]  [<ffffffff810029a5>] ? do_notify_resume+0x75/0x80
> >[   54.373319]  [<ffffffff81882742>] system_call_fastpath+0x16/0x1b
> >[   54.379382] ---[ end trace ea6ecbf97debf2e2 ]---
> >[   54.385157] Starting new kernel
> >
> >
> >I am leaving the logs from previous mail intact so that newly CCed
> >people can have a look at it and don't go hunting for old mail in
> >lkml archives.
> >
> >Thanks
> >Vivek
> >
> 
> Looks like I fixed one bug and added another.
> After ->shutdown() the device can be in the D3cold state, where config space is unreachable.
> 
> Try this patch:
> 
> --- a/drivers/pci/pci-driver.c
> +++ b/drivers/pci/pci-driver.c
> @@ -385,6 +385,12 @@ static void pci_device_shutdown(struct device *dev)
> 
>         if (drv && drv->shutdown)
>                 drv->shutdown(pci_dev);
> +
> +       if (pci_dev->current_state == PCI_D3cold) {
> +               WARN_ON(pci_dev->msi_enabled || pci_dev->msix_enabled);
> +               return;
> +       }
> +
>         pci_msi_shutdown(pci_dev);
>         pci_msix_shutdown(pci_dev);
> 
>

Hi, 

So this patch is supposed to fix the warning? This warning showed up
only after reverting your patch. So do you agree that your original
patch should be reverted?

I applied this patch and the warning is still there (after reverting your
original patch).

I thought we would first address the issue of why kexec is not working
with your patch.

Thanks
Vivek
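
For readers following the thread, here is a minimal user-space sketch of the
enable/disable nesting that the quoted changelog describes. It is a toy model
and not the kernel code: struct model_dev and every model_* name are
hypothetical, and the only value taken from the kernel is the Bus Master bit
(0x4, the same value as PCI_COMMAND_MASTER). It assumes, as the changelog
says, that only the last nested disable call touches the device, and that a
disable on a device whose enable count is already zero or lower is what
produces the "disabling already-disabled device" warning.

```c
#include <stdbool.h>
#include <stdio.h>

/* Same bit value as the kernel's PCI_COMMAND_MASTER; the name is ours. */
#define MODEL_COMMAND_MASTER 0x4

/* Hypothetical stand-in for the two fields that matter in this thread. */
struct model_dev {
	int enable_cnt;          /* nesting depth of enable calls */
	unsigned short command;  /* stand-in for the PCI command register */
};

/* Model of a nested enable: only bumps the reference count. */
static void model_enable(struct model_dev *d)
{
	d->enable_cnt++;
}

/* Model of turning bus mastering on. */
static void model_set_master(struct model_dev *d)
{
	d->command |= MODEL_COMMAND_MASTER;
}

/*
 * Model of a reference-counted disable. Returns true if the
 * "already-disabled" warning would fire. Bus Master is cleared only
 * when the count drops to exactly zero, i.e. on the last disable.
 */
static bool model_disable(struct model_dev *d)
{
	bool warned = d->enable_cnt <= 0;

	if (warned)
		fprintf(stderr, "model: disabling already-disabled device\n");

	if (--d->enable_cnt != 0)
		return warned;

	d->command &= (unsigned short)~MODEL_COMMAND_MASTER;
	return warned;
}

/* Model of an unconditional clear: ignores the count entirely. */
static void model_clear_master(struct model_dev *d)
{
	d->command &= (unsigned short)~MODEL_COMMAND_MASTER;
}
```

In this model, a device enabled twice keeps its Bus Master bit after one
model_disable() call, while model_clear_master() drops it immediately. A
model_disable() call on a device whose count has already reached zero reports
the warning, which is consistent with the trace above if the driver's own
shutdown hook had already disabled the device before pci_device_shutdown()
tried again.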

[   38.048452] tg3 0000:0e:00.0: System wakeup enabled by ACPI
[   38.266774] sd 5:0:0:0: [sdd] Synchronizing SCSI cache
[   38.272116] sd 3:0:0:0: [sdc] Synchronizing SCSI cache
[   38.277361] sd 2:0:0:0: [sdb] Synchronizing SCSI cache
[   38.282661] sd 0:0:0:0: [sda] Synchronizing SCSI cache
[   38.288467] ------------[ cut here ]------------
[   38.293151] WARNING: at drivers/pci/pci.c:1397 pci_disable_device+0x90/0xa0()
[   38.300339] Hardware name: HP xw6600 Workstation
[   38.305014] Device pci
disabling already-disabled device
[   38.310294] Modules linked in: floppy
[   38.314356] Pid: 5258, comm: kexec Not tainted 3.9.0-rc2+ #209
[   38.320243] Call Trace:
[   38.322755]  [<ffffffff8133c600>] ? pci_disable_device+0x60/0xa0
[   38.328818]  [<ffffffff8103e49f>] warn_slowpath_common+0x7f/0xc0
[   38.334880]  [<ffffffff8103e596>] warn_slowpath_fmt+0x46/0x50
[   38.340681]  [<ffffffff8133c592>] ? do_pci_disable_device+0x52/0x60
[   38.347003]  [<ffffffff8133c630>] pci_disable_device+0x90/0xa0
[   38.352892]  [<ffffffff8133f2d4>] pci_device_shutdown+0x54/0x80
[   38.358868]  [<ffffffff81462b5d>] device_shutdown+0x1d/0x180
[   38.364584]  [<ffffffff81056ba6>] kernel_restart_prepare+0x36/0x50
[   38.370820]  [<ffffffff810a16c0>] kernel_kexec+0x50/0x80
[   38.376188]  [<ffffffff81056e35>] sys_reboot+0x1f5/0x260
[   38.381558]  [<ffffffff811621b9>] ? mntput_no_expire+0x49/0x160
[   38.387532]  [<ffffffff811622f6>] ? mntput+0x26/0x40
[   38.392555]  [<ffffffff81144539>] ? __fput+0x1a9/0x280
[   38.397753]  [<ffffffff8187a0ee>] ? _raw_spin_unlock_irq+0xe/0x30
[   38.403901]  [<ffffffff8105fae4>] ? task_work_run+0xc4/0xe0
[   38.409531]  [<ffffffff810029a5>] ? do_notify_resume+0x75/0x80
[   38.415420]  [<ffffffff81882742>] system_call_fastpath+0x16/0x1b
[   38.421479] ---[ end trace 61d35d2d55ce5d3d ]---
[   38.427241] Starting new kernel
[    0.000000] Initializing cgroup subsys cpuset
[    0.000000] Initializing cgroup subsys cpu
 


