do_IRQ: 5.33 No irq handler for vector

Keith Busch keith.busch at intel.com
Tue Jan 23 14:16:32 PST 2018


On Tue, Jan 23, 2018 at 04:16:48PM +0800, jianchao.wang wrote:
> Hi all
> 
> I got the following log:
> [  446.908030] do_IRQ: 5.33 No irq handler for vector
> 
> When running the following test:
> loop fio job
> size=256m
> rw=randread
> bs=4k
> ioengine=libaio
> iodepth=64
> direct=1
> numjobs=16
> filename=/dev/nvme0n1
> 
> and
>  
> while true
> do
>     echo 1 > /sys/block/nvme0n1/device/reset_controller 
>     sleep 1
> done
> 
> Vector 33 is the one used by nvmeq2~nvmeq8 (8 CPUs on my machine).
> 
> When the error log is printed out, reset_work is sleeping at
> nvme_dev_disable
>   ->nvme_disable_io_queues
>     -> wait_for_completion_io_timeout
> 
> In theory, the irq should have been masked by 
> nvme_suspend_queue
>   -> pci_free_irq
>     -> __free_irq //if no other irq_action
>       -> irq_shutdown
>         -> __irq_disable
>           -> mask_irq
>             -> pci_msi_mask_irq
> 
> Why is it still there?

The message most likely indicates there is no struct irq_desc associated
with the vector on this CPU.
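
For reference, the message comes from the x86 do_IRQ() entry path when
the per-cpu vector table has no usable descriptor for the incoming
vector. Here is a condensed sketch of the 4.15-era code (simplified
from arch/x86/kernel/irq.c, not verbatim); the "5.33" in the log above
decodes as CPU 5, vector 33:

  __visible unsigned int __irq_entry do_IRQ(struct pt_regs *regs)
  {
          struct irq_desc *desc;
          unsigned vector = ~regs->orig_ax;

          entering_irq();

          /* per-cpu table mapping hardware vectors to irq descriptors */
          desc = __this_cpu_read(vector_irq[vector]);

          /*
           * handle_irq() fails when desc is not a valid descriptor,
           * e.g. VECTOR_UNUSED (NULL) after clear_irq_vector().
           */
          if (!handle_irq(desc, regs)) {
                  ack_APIC_irq();

                  if (desc != VECTOR_RETRIGGERED)
                          pr_emerg_ratelimited("%s: %d.%d No irq handler for vector\n",
                                               __func__, smp_processor_id(),
                                               vector);
          }

          exiting_irq();
          return 1;
  }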

Even if the device happens to emit an MSI after we call pci_free_irq, we
haven't disabled MSI at this point, so the struct irq_desc should still
exist, even if disabled. Now it looks like this call stack will get to:

  __irq_domain_deactivate_irq
    x86_vector_deactivate
      clear_irq_vector

Which sets the vector's desc to VECTOR_UNUSED, i.e. NULL. Maybe we should
disable the controller before freeing the irqs. We free the irqs first
because we were tying that to mean a quiesced queue, but that was before
we had a way to quiesce blk-mq.
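
Something along these lines, perhaps. This is only an untested sketch of
the idea against the 4.15-era helpers (function names and signatures as I
remember them from that era; blk-mq freeze/quiesce and the dead-controller
path elided), not a real patch. Note the admin queue's vector has to stay
registered until nvme_disable_io_queues() is done, since the queue
deletion commands complete through it:

  static void nvme_dev_disable(struct nvme_dev *dev, bool shutdown)
  {
          int i, queues = dev->online_queues - 1;

          /* ... freeze/quiesce the blk-mq queues first ... */

          /*
           * Delete the IO queues and disable (or shut down) the
           * controller while every vector still has a handler.
           */
          nvme_disable_io_queues(dev, queues);
          if (shutdown)
                  nvme_shutdown_ctrl(&dev->ctrl);
          else
                  nvme_disable_ctrl(&dev->ctrl, dev->ctrl.cap);

          /*
           * Only now free the vectors: the controller is disabled and
           * can't raise another MSI, so pci_free_irq() can't race one.
           */
          for (i = dev->ctrl.queue_count - 1; i >= 0; i--)
                  nvme_suspend_queue(dev->queues[i]);

          nvme_pci_disable(dev);
  }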


