Bug Report: can't unload nvme module in case of disabled device

Keith Busch keith.busch at intel.com
Thu Aug 10 12:36:15 PDT 2017


On Thu, Aug 10, 2017 at 08:04:13PM +0300, Max Gurtovoy wrote:
> 
> I'm using PCIe ctrl.
> Using 4.13-rc4+ I couldn't even run the easier scenario of only unloading
> the nvme module (with SAMSUNG MZPLL1T6HEHP-00003 and Intel P3500/3700
> devices):
> 
> [  369.997917] INFO: task modprobe:3709 blocked for more than 120 seconds.
> [  370.005215]       Not tainted 4.13.0-rc4+ #21
> [  370.010017] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> [  370.018647] modprobe        D    0  3709   3654 0x00000000
> [  370.024695] Call Trace:
> [  370.027400]  __schedule+0x1dc/0x780
> [  370.031261]  schedule+0x36/0x80
> [  370.034756]  blk_mq_freeze_queue_wait+0x4b/0xb0
> [  370.039750]  ? remove_wait_queue+0x60/0x60
> [  370.044263]  blk_freeze_queue+0x1a/0x20
> [  370.048489]  blk_cleanup_queue+0x7f/0x150
> [  370.052927]  nvme_dev_remove_admin+0x36/0x50 [nvme]
> [  370.058303]  nvme_remove+0xa2/0x130 [nvme]
> [  370.062820]  pci_device_remove+0x39/0xc0
> [  370.067142]  device_release_driver_internal+0x141/0x200
> [  370.072898]  driver_detach+0x3f/0x80
> [  370.076852]  bus_remove_driver+0x55/0xd0
> [  370.081186]  driver_unregister+0x2c/0x50
> [  370.085521]  pci_unregister_driver+0x2a/0xa0
> [  370.090227]  nvme_exit+0x10/0xb84 [nvme]
> [  370.094562]  SyS_delete_module+0x171/0x250
> [  370.099101]  ? exit_to_usermode_loop+0x5e/0x88
> [  370.103996]  entry_SYSCALL_64_fastpath+0x1a/0xa5
> [  370.109096] RIP: 0033:0x7f146b5106b7
> [  370.113037] RSP: 002b:00007ffd2cae12e8 EFLAGS: 00000206 ORIG_RAX: 00000000000000b0
> [  370.121431] RAX: ffffffffffffffda RBX: 0000000000000003 RCX: 00007f146b5106b7
> [  370.129295] RDX: 0000000000000000 RSI: 0000000000000800 RDI: 000000000223f5e8
> [  370.137167] RBP: 000000000223f580 R08: 00007f146b7d5060 R09: 00007f146b580a40
> [  370.145029] R10: 00007ffd2cae1070 R11: 0000000000000206 R12: 00007ffd2cae0310
> [  370.152890] R13: 0000000000000000 R14: 000000000223f5e8 R15: 0000000000000000
> 
> The new scenario:
> 1. modprobe nvme
> 2. sleep 10
> 3. modprobe -r nvme
> 
> This works on 4.11.0/4.12.0 but not on 4.13.0-rc4+.
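
The three quoted steps can be sketched as a small shell function for
anyone who wants to retry them. This is only a sketch and not part of
the original report: the names repro_nvme_unload, RUN and SETTLE_SECS
are hypothetical, and it assumes a root shell on a test machine where
unloading nvme is safe.

```shell
#!/bin/sh
# Sketch of the quoted reproduction steps (load, wait, unload).
# Assumption: run as root on a test machine; unloading nvme removes
# access to every NVMe namespace on the system.

# RUN is a hypothetical hook: set RUN=echo to print the commands
# instead of executing them.
RUN="${RUN:-}"

# SETTLE_SECS is an assumed name for the 10-second delay in the report.
SETTLE_SECS="${SETTLE_SECS:-10}"

repro_nvme_unload() {
    $RUN modprobe nvme || return 1
    $RUN sleep "$SETTLE_SECS" || return 1
    # On 4.13.0-rc4+ the report says this step blocks in
    # blk_mq_freeze_queue_wait and triggers the hung-task warning.
    $RUN modprobe -r nvme
}
```

Setting RUN=echo before calling repro_nvme_unload prints the three
commands without touching the module, which is a cheap way to check the
sequence before running it for real.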

I'm not able to reproduce this. The stack trace says modprobe is stuck in
blk_mq_freeze_queue_wait, which means there are still entered requests on
the admin queue, but that shouldn't be possible at this point in
nvme_remove. I'll keep looking.


