completion timeouts with pin-based interrupts in QEMU hw/nvme
Guenter Roeck
linux at roeck-us.net
Thu Jan 12 15:57:48 PST 2023
On 1/12/23 09:45, Klaus Jensen wrote:
> On Jan 12 09:34, Keith Busch wrote:
>> On Thu, Jan 12, 2023 at 02:10:51PM +0100, Klaus Jensen wrote:
>>>
>>> The pin-based interrupt logic in hw/nvme seems sound enough to me, so I
>>> am wondering if there is something going on with the kernel driver (but
>>> I certainly do not rule out that hw/nvme is at fault here, since
> >>> pin-based interrupts have also been a source of several issues in the
>>> past).
>>
>> Does it work if you change the pci_irq_assert() back to pci_irq_pulse()?
>> While probably not the "correct" thing to do, it has better results in
>> my testing.
>>
>
> A simple s/pci_irq_assert/pci_irq_pulse broke the device. However,
>
> diff --git i/hw/nvme/ctrl.c w/hw/nvme/ctrl.c
> index 03760ddeae8c..0fc46dcb9ec4 100644
> --- i/hw/nvme/ctrl.c
> +++ w/hw/nvme/ctrl.c
> @@ -477,6 +477,7 @@ static void nvme_irq_check(NvmeCtrl *n)
>          return;
>      }
>      if (~intms & n->irq_status) {
> +        pci_irq_deassert(&n->parent_obj);
>          pci_irq_assert(&n->parent_obj);
>      } else {
>          pci_irq_deassert(&n->parent_obj);
>
>
> seems to do the trick (pulse is the other way around, assert, then
> deassert).
>
> Probably not the "correct" thing to do, but I'll take it since it seems
> to fix it. On a simple boot loop I got the timeout about 1 out of 5. I'm
> on ~20 runs now and have not encountered it.
>
> I'll see if I can set up a mips rootfs and test that. Guenter, what MIPS
> machine/board(s) are you testing?

So, for mipsel, two sets of results for the above change:

First, qemu v7.2 is already much better than qemu v7.1. With qemu v7.1,
the boot test fails roughly every other run. The failure rate with qemu
v7.2 is much lower.

Second, my nvme boot test with qemu v7.2 fails after ~5-10 iterations.
After the above change, I did not see a single failure in 50 boot tests.

I'll test the other suggested change next.
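
For reference, here is a toy model of the three signalling orders discussed
above (assert only, deassert then assert, and pulse). It is only a sketch and
not QEMU code: struct edge_latch and the notify_*() helpers are hypothetical
names, and it assumes the receiving side counts an interrupt only on a
low-to-high transition of the line. Nothing in this thread establishes that
this is the actual root cause.

```c
#include <stdbool.h>

/* Hypothetical receiver that latches an interrupt on a rising edge only. */
struct edge_latch {
    bool level;     /* current level of the INTx line */
    int delivered;  /* number of rising edges observed so far */
};

/* Drive the line to a new level and count low-to-high transitions. */
static void line_set(struct edge_latch *l, bool level)
{
    if (level && !l->level) {
        l->delivered++;
    }
    l->level = level;
}

/* Assert only: no new edge if the line is already high. */
static void notify_assert(struct edge_latch *l)
{
    line_set(l, true);
}

/* Deassert, then assert: always yields an edge and leaves the line high. */
static void notify_deassert_assert(struct edge_latch *l)
{
    line_set(l, false);
    line_set(l, true);
}

/* Pulse (assert, then deassert): leaves the line low afterwards. */
static void notify_pulse(struct edge_latch *l)
{
    line_set(l, true);
    line_set(l, false);
}
```

In this model, two back-to-back notify_assert() calls are seen as one
interrupt, while notify_deassert_assert() is seen every time and keeps the
line high for anything that samples the level. notify_pulse() also produces
an edge every time but leaves the line low, so a receiver that samples the
level afterwards would see nothing pending.
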
Guenter