do_IRQ: 5.33 No irq handler for vector
Keith Busch
keith.busch at intel.com
Tue Jan 23 14:16:32 PST 2018
On Tue, Jan 23, 2018 at 04:16:48PM +0800, jianchao.wang wrote:
> Hi all
>
> I got the following log:
> [ 446.908030] do_IRQ: 5.33 No irq handler for vector
>
> while running the following test:
> loop fio job
> size=256m
> rw=randread
> bs=4k
> ioengine=libaio
> iodepth=64
> direct=1
> numjobs=16
> filename=/dev/nvme0n1
>
> and
>
> while true
> do
> echo 1 > /sys/block/nvme0n1/device/reset_controller
> sleep 1
> done
>
> Vector 33 is the vector used by nvmeq2~nvmeq8 (8 CPUs on my machine).
>
> When the error log is printed, the reset_work is sleeping at
> nvme_dev_disable
> ->nvme_disable_io_queues
> -> wait_for_completion_io_timeout
>
> In theory, the irq should have been masked by
> nvme_suspend_queue
> -> pci_free_irq
> -> __free_irq //if no other irq_action
> -> irq_shutdown
> -> __irq_disable
> -> mask_irq
> -> pci_msi_mask_irq
>
> Why is it still there?
The message most likely indicates there is no struct irq_desc associated
with the vector on this CPU.
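For reference, here is a condensed version of the check that prints
that message, based on do_IRQ() in arch/x86/kernel/irq.c around v4.15
(error handling trimmed); the "5.33" in the log is CPU 5, vector 33:

  __visible unsigned int __irq_entry do_IRQ(struct pt_regs *regs)
  {
          struct pt_regs *old_regs = set_irq_regs(regs);
          struct irq_desc *desc;
          unsigned vector = ~regs->orig_ax;

          entering_irq();

          /* Per-cpu table mapping vector -> irq_desc; the slot reads
           * as VECTOR_UNUSED once the vector has been torn down. */
          desc = __this_cpu_read(vector_irq[vector]);

          if (!handle_irq(desc, regs)) {
                  ack_APIC_irq();

                  if (desc != VECTOR_RETRIGGERED)
                          pr_emerg_ratelimited("%s: %d.%d No irq handler for vector\n",
                                               __func__, smp_processor_id(),
                                               vector);
                  else
                          __this_cpu_write(vector_irq[vector], VECTOR_UNUSED);
          }

          exiting_irq();

          set_irq_regs(old_regs);
          return 1;
  }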
Even if the device happens to emit an MSI after we call pci_free_irq, we
haven't disabled MSI at this point, so the struct irq_desc should still
exist, even if disabled. Now it looks like this call stack will get to:
  __irq_domain_deactivate_irq
    x86_vector_deactivate
      clear_irq_vector
which sets the vector's desc to VECTOR_UNUSED or NULL.
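(Trimmed sketch of that last step, from arch/x86/kernel/apic/vector.c
after the v4.15 vector-management rework; locking, tracing, and the
vector-matrix bookkeeping are elided:)

  static void clear_irq_vector(struct irq_data *irqd)
  {
          struct apic_chip_data *apicd = apic_chip_data(irqd);
          unsigned int vector = apicd->vector;

          if (!vector)
                  return;

          /* From here on, do_IRQ() on that CPU finds no desc for this
           * vector and logs "No irq handler for vector". */
          per_cpu(vector_irq, apicd->cpu)[vector] = VECTOR_UNUSED;
          apicd->vector = 0;
          /* irq_matrix_free() etc. elided */
  }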
Maybe we should disable the controller before freeing the irqs. We free
the irqs first because we were tying that to mean a quiesced queue, but
that was before we had a way to quiesce blk-mq.
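Something like the following, as a rough sketch only -- the real
nvme_dev_disable() in drivers/nvme/host/pci.c differs across kernel
versions, so treat the layout below as hypothetical:

  /* Hypothetical reordering sketch, not the actual driver code. */
  static void nvme_dev_disable(struct nvme_dev *dev, bool shutdown)
  {
          int i;

          nvme_stop_queues(&dev->ctrl);           /* quiesce blk-mq */

          /* Shut the controller down first so it can no longer raise
           * an MSI on a vector we are about to tear down... */
          nvme_disable_io_queues(dev);
          nvme_disable_admin_queue(dev, shutdown);

          /* ...and only then free the per-queue vectors.  Today the
           * suspend loop runs before the queues are deleted, leaving
           * the window where a stray MSI hits a VECTOR_UNUSED slot. */
          for (i = dev->ctrl.queue_count - 1; i > 0; i--)
                  nvme_suspend_queue(dev->queues[i]);
  }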