Review Request: nvme-pci: Fix multiple races in nvme_setup_io_queues()

Casey Chen cachen at purestorage.com
Tue Jun 15 14:57:40 PDT 2021


All tests and the fix are based on tag nvme-5.14-2021-06-08 of the repo
http://git.infradead.org/nvme.git

Testing method (the power off/on steps are placeholders for an actual
slot power toggle; one possible implementation is sketched below):

while :; do
    power_off_one_drive       # placeholder: hot-remove the drive
    sleep $((RANDOM%3)).$((RANDOM%10))
    power_on_the_same_drive   # placeholder: hot-add the same drive
    sleep $((RANDOM%3)).$((RANDOM%10))
done
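
For reference, a minimal sketch of one way to implement the power
toggle, assuming the slot is managed by pciehp and exposes a power
attribute under /sys/bus/pci/slots/. The slot number 402 is taken from
the sample log below and is hypothetical for any other system; this is
not necessarily the exact script used in testing:

#!/bin/bash
# Power-cycle a pciehp-managed PCIe slot in a loop with random delays.
# Assumes /sys/bus/pci/slots/<slot>/power exists (pciehp loaded) and
# that the script runs as root.
SLOT=402                                 # hypothetical: from the log below
POWER=/sys/bus/pci/slots/$SLOT/power

while :; do
    echo 0 > "$POWER"                        # power off: surprise hot-removal
    sleep $((RANDOM % 3)).$((RANDOM % 10))   # random 0.0-2.9s delay
    echo 1 > "$POWER"                        # power on: hot-add the drive back
    sleep $((RANDOM % 3)).$((RANDOM % 10))
done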

Sample crash call trace. During a surprise hot-removal,
nvme_dev_disable() calls pci_free_irq_vectors() while the in-flight
nvme_setup_io_queues() still holds a requested IRQ, tripping the
BUG_ON() in free_msi_irqs():
[11668.533431] pcieport 0000:87:08.0: pciehp: Slot(402): Card present
...
[11668.681298] nvme nvme12: pci function 0000:8c:00.0
[11668.681354] nvme 0000:8c:00.0: enabling device (0100 -> 0102)
[11669.046142] pcieport 0000:87:08.0: pciehp: Slot(402): Link Down
[11669.046146] pcieport 0000:87:08.0: pciehp: Slot(402): Card not present
[11669.077428] ------------[ cut here ]------------
[11669.077431] kernel BUG at drivers/pci/msi.c:348!
[11669.077555] invalid opcode: 0000 [#1] SMP KASAN
[11669.077658] CPU: 31 PID: 716 Comm: irq/127-pciehp Not tainted 5.13.0-rc3+
[11669.077869] Hardware name: <MASKED OFF>
[11669.078022] RIP: 0010:free_msi_irqs+0x28a/0x2d0
...
[11669.093982] Call Trace:
[11669.096850]  pci_free_irq_vectors+0xe/0x20
[11669.099695]  nvme_dev_disable+0x140/0x760 [nvme]
[11669.102503]  ? _raw_spin_lock_irqsave+0x9c/0x100
[11669.105271]  ? trace_hardirqs_on+0x2c/0xe0
[11669.107994]  nvme_remove+0x191/0x1e0 [nvme]
[11669.110689]  pci_device_remove+0x6b/0x110
[11669.113316]  device_release_driver_internal+0x14f/0x280
[11669.115939]  pci_stop_bus_device+0xcb/0x100
[11669.118515]  pci_stop_and_remove_bus_device+0xe/0x20
[11669.121079]  pciehp_unconfigure_device+0xfa/0x200
[11669.123597]  ? pciehp_configure_device+0x1c0/0x1c0
[11669.126049]  ? trace_hardirqs_on+0x2c/0xe0
[11669.128444]  pciehp_disable_slot+0xc4/0x1a0
[11669.130771]  ? pciehp_runtime_suspend+0x40/0x40
[11669.133054]  ? __mutex_lock_slowpath+0x10/0x10
[11669.135289]  ? trace_hardirqs_on+0x2c/0xe0
[11669.137462]  pciehp_handle_presence_or_link_change+0x15c/0x4f0
[11669.139632]  ? down_read+0x11f/0x1a0
[11669.141731]  ? pciehp_handle_disable_request+0x80/0x80
[11669.143817]  ? rwsem_down_read_slowpath+0x600/0x600
[11669.145851]  ? __radix_tree_lookup+0xb2/0x130
[11669.147842]  pciehp_ist+0x19d/0x1a0
[11669.149790]  ? pciehp_set_indicators+0xe0/0xe0
[11669.151704]  ? irq_finalize_oneshot.part.46+0x1d0/0x1d0
[11669.153588]  irq_thread_fn+0x3f/0xa0
[11669.155407]  irq_thread+0x195/0x290
[11669.157147]  ? irq_thread_check_affinity.part.49+0xe0/0xe0
[11669.158883]  ? _raw_read_lock_irq+0x50/0x50
[11669.160611]  ? _raw_read_lock_irq+0x50/0x50
[11669.162320]  ? irq_forced_thread_fn+0xf0/0xf0
[11669.164032]  ? trace_hardirqs_on+0x2c/0xe0
[11669.165731]  ? irq_thread_check_affinity.part.49+0xe0/0xe0
[11669.167461]  kthread+0x1c8/0x1f0
[11669.169173]  ? kthread_parkme+0x40/0x40
[11669.170883]  ret_from_fork+0x22/0x30
