Review Request: nvme-pci: Fix multiple races in nvme_setup_io_queues()
Casey Chen
cachen at purestorage.com
Tue Jun 15 14:57:40 PDT 2021
All tests and the fix are based on tag nvme-5.14-2021-06-08 of the repo
http://git.infradead.org/nvme.git
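For reference, the baseline can be checked out with standard git commands
(tag and URL as above):

  git clone http://git.infradead.org/nvme.git
  cd nvme
  git checkout nvme-5.14-2021-06-08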
Testing method
while :; do
    # power off one drive
    sleep $((RANDOM%3)).$((RANDOM%10))
    # power on the same drive
    sleep $((RANDOM%3)).$((RANDOM%10))
done
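Concretely, the power off/on steps can be driven through the pciehp slot
power control in sysfs. A minimal sketch, assuming the drive under test sits
in slot 402 (the slot number seen in the log below) and that pciehp exposes
/sys/bus/pci/slots/402/power; adjust the slot number for your setup:

  SLOT=402
  while :; do
      echo 0 > /sys/bus/pci/slots/$SLOT/power   # pciehp powers the slot off
      sleep $((RANDOM%3)).$((RANDOM%10))
      echo 1 > /sys/bus/pci/slots/$SLOT/power   # pciehp powers the slot back on
      sleep $((RANDOM%3)).$((RANDOM%10))
  done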
Sample crash call trace. The BUG at drivers/pci/msi.c:348 is the BUG_ON in
free_msi_irqs() that trips when an MSI vector being freed still has an IRQ
action attached, i.e. hot removal (nvme_remove -> nvme_dev_disable ->
pci_free_irq_vectors) raced with queue interrupts that
nvme_setup_io_queues() had requested.
[11668.533431] pcieport 0000:87:08.0: pciehp: Slot(402): Card present
...
[11668.681298] nvme nvme12: pci function 0000:8c:00.0
[11668.681354] nvme 0000:8c:00.0: enabling device (0100 -> 0102)
[11669.046142] pcieport 0000:87:08.0: pciehp: Slot(402): Link Down
[11669.046146] pcieport 0000:87:08.0: pciehp: Slot(402): Card not present
[11669.077428] ------------[ cut here ]------------
[11669.077431] kernel BUG at drivers/pci/msi.c:348!
[11669.077555] invalid opcode: 0000 [#1] SMP KASAN
[11669.077658] CPU: 31 PID: 716 Comm: irq/127-pciehp Not tainted 5.13.0-rc3+
[11669.077869] Hardware name: <MASKED OFF>
[11669.078022] RIP: 0010:free_msi_irqs+0x28a/0x2d0
...
[11669.093982] Call Trace:
[11669.096850] pci_free_irq_vectors+0xe/0x20
[11669.099695] nvme_dev_disable+0x140/0x760 [nvme]
[11669.102503] ? _raw_spin_lock_irqsave+0x9c/0x100
[11669.105271] ? trace_hardirqs_on+0x2c/0xe0
[11669.107994] nvme_remove+0x191/0x1e0 [nvme]
[11669.110689] pci_device_remove+0x6b/0x110
[11669.113316] device_release_driver_internal+0x14f/0x280
[11669.115939] pci_stop_bus_device+0xcb/0x100
[11669.118515] pci_stop_and_remove_bus_device+0xe/0x20
[11669.121079] pciehp_unconfigure_device+0xfa/0x200
[11669.123597] ? pciehp_configure_device+0x1c0/0x1c0
[11669.126049] ? trace_hardirqs_on+0x2c/0xe0
[11669.128444] pciehp_disable_slot+0xc4/0x1a0
[11669.130771] ? pciehp_runtime_suspend+0x40/0x40
[11669.133054] ? __mutex_lock_slowpath+0x10/0x10
[11669.135289] ? trace_hardirqs_on+0x2c/0xe0
[11669.137462] pciehp_handle_presence_or_link_change+0x15c/0x4f0
[11669.139632] ? down_read+0x11f/0x1a0
[11669.141731] ? pciehp_handle_disable_request+0x80/0x80
[11669.143817] ? rwsem_down_read_slowpath+0x600/0x600
[11669.145851] ? __radix_tree_lookup+0xb2/0x130
[11669.147842] pciehp_ist+0x19d/0x1a0
[11669.149790] ? pciehp_set_indicators+0xe0/0xe0
[11669.151704] ? irq_finalize_oneshot.part.46+0x1d0/0x1d0
[11669.153588] irq_thread_fn+0x3f/0xa0
[11669.155407] irq_thread+0x195/0x290
[11669.157147] ? irq_thread_check_affinity.part.49+0xe0/0xe0
[11669.158883] ? _raw_read_lock_irq+0x50/0x50
[11669.160611] ? _raw_read_lock_irq+0x50/0x50
[11669.162320] ? irq_forced_thread_fn+0xf0/0xf0
[11669.164032] ? trace_hardirqs_on+0x2c/0xe0
[11669.165731] ? irq_thread_check_affinity.part.49+0xe0/0xe0
[11669.167461] kthread+0x1c8/0x1f0
[11669.169173] ? kthread_parkme+0x40/0x40
[11669.170883] ret_from_fork+0x22/0x30