completion timeouts with pin-based interrupts in QEMU hw/nvme

Thu Jan 12 16:27:18 PST 2023

On 1/12/23 11:27, Keith Busch wrote:
> On Thu, Jan 12, 2023 at 06:45:55PM +0100, Klaus Jensen wrote:
>> On Jan 12 09:34, Keith Busch wrote:
>>> On Thu, Jan 12, 2023 at 02:10:51PM +0100, Klaus Jensen wrote:
>>>>
>>>> The pin-based interrupt logic in hw/nvme seems sound enough to me, so I
>>>> am wondering if there is something going on with the kernel driver (but
>>>> I certainly do not rule out that hw/nvme is at fault here, since
>>>> pin-based interrupts has also been a source of several issues in the
>>>> past).
>>>
>>> Does it work if you change the pci_irq_assert() back to pci_irq_pulse()?
>>> While probably not the "correct" thing to do, it has better results in
>>> my testing.
>>>
>>
>> A simple s/pci_irq_assert/pci_irq_pulse broke the device. However,
>>
>> 	diff --git i/hw/nvme/ctrl.c w/hw/nvme/ctrl.c
>> 	index 03760ddeae8c..0fc46dcb9ec4 100644
>> 	--- i/hw/nvme/ctrl.c
>> 	+++ w/hw/nvme/ctrl.c
>> 	@@ -477,6 +477,7 @@ static void nvme_irq_check(NvmeCtrl *n)
>> 		 return;
>> 	     }
>> 	     if (~intms & n->irq_status) {
>> 	+        pci_irq_deassert(&n->parent_obj);
>> 		 pci_irq_assert(&n->parent_obj);
>> 	     } else {
>> 		 pci_irq_deassert(&n->parent_obj);
>>
>>
>> seems to do the trick (pulse is the other way around, assert, then
>> deassert).
>>
>> Probably not the "correct" thing to do, but I'll take it since it seems
>> to fix it. On a simple boot loop I got the timeout about 1 out of 5. I'm
>> on ~20 runs now and have not encountered it.
>>
>> I'll see if I can set up a mips rootfs and test that. Guenter, what MIPS
>> machine/board(s) are you testing?
> 
> Could you try the below?
> 

This works as well: 50 iterations with no failures after applying the patch
below on top of qemu v7.2.0. Tested with qemu-system-mipsel.

Guenter

> ---
> diff --git a/hw/nvme/ctrl.c b/hw/nvme/ctrl.c
> index 2c85de4700..521c3c80c1 100644
> --- a/hw/nvme/ctrl.c
> +++ b/hw/nvme/ctrl.c
> @@ -558,6 +558,20 @@ static void nvme_irq_deassert(NvmeCtrl *n, NvmeCQueue *cq)
>       }
>   }
>   
> +static void nvme_irq_pulse(NvmeCtrl *n, NvmeCQueue *cq)
> +{
> +    if (!cq->irq_enabled) {
> +        return;
> +    }
> +
> +    if (msix_enabled(&(n->parent_obj))) {
> +        msix_notify(&(n->parent_obj), cq->vector);
> +        return;
> +    }
> +
> +    pci_irq_pulse(&n->parent_obj);
> +}
> +
>   static void nvme_req_clear(NvmeRequest *req)
>   {
>       req->ns = NULL;
> @@ -6917,6 +6931,12 @@ static void nvme_process_db(NvmeCtrl *n, hwaddr addr, int val)
>               }
>   
>               nvme_irq_deassert(n, cq);
> +        } else {
> +            /*
> +             * Retrigger the irq just to make sure the host has no excuse for
> +             * not knowing there's more work to complete on this CQ.
> +             */
> +            nvme_irq_pulse(n, cq);
>           }
>       } else {
>           /* Submission queue doorbell write */
> --