Questions on interrupt handling

Angelo Brito asb at cin.ufpe.br
Thu Oct 23 07:26:09 PDT 2014


Ok Keith,
I presume you meant this:
static irqreturn_t nvme_irq(int irq, void *data)
{
       irqreturn_t result;
       struct nvme_queue *nvmeq = data;
+       u32 maskvec;
+
+       /* Compute the interrupt mask bit for this queue's vector. */
+       maskvec = 1 << nvmeq->cq_vector;
+
       spin_lock(&nvmeq->q_lock);
+
+       /* Mask this vector's interrupt; skipped for MSI-X, where this is undefined. */
+       if (!nvmeq->dev->pci_dev->msix_enabled)
+               writel(maskvec, &nvmeq->dev->bar->intms);
+
       nvme_process_cq(nvmeq);
       result = nvmeq->cqe_seen ? IRQ_HANDLED : IRQ_NONE;
       nvmeq->cqe_seen = 0;
+
+       /* Unmask this vector's interrupt, again skipping MSI-X. */
+       if (!nvmeq->dev->pci_dev->msix_enabled)
+               writel(maskvec, &nvmeq->dev->bar->intmc);
+
       spin_unlock(&nvmeq->q_lock);
       return result;
}
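
In case it helps review, here is an untested sketch of the same change with the
guarded register writes pulled into two small helpers, so the MSI/INTx-only
check lives in a single place. The helper names are just placeholders (they
don't exist in the driver today); the struct fields are the ones already used
in the diff above:

/* Sketch only: hypothetical helpers around the INTMS/INTMC writes. */
static void nvme_mask_vector(struct nvme_queue *nvmeq)
{
       /* INTMS/INTMC must not be touched when MSI-X is configured. */
       if (!nvmeq->dev->pci_dev->msix_enabled)
               writel(1 << nvmeq->cq_vector, &nvmeq->dev->bar->intms);
}

static void nvme_unmask_vector(struct nvme_queue *nvmeq)
{
       if (!nvmeq->dev->pci_dev->msix_enabled)
               writel(1 << nvmeq->cq_vector, &nvmeq->dev->bar->intmc);
}

nvme_irq() would then call nvme_mask_vector() before nvme_process_cq() and
nvme_unmask_vector() after it, inside q_lock, exactly as in the diff above.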

PS: I don't have an MSI-X device to verify that this doesn't affect it.
Could you check?


Regards,
Angelo Silva Brito.
Graduate in Computer Engineering - UFPE
http://about.me/angelobrito
_________________________________________________


On Wed, Oct 22, 2014 at 3:10 PM, Keith Busch <keith.busch at intel.com> wrote:
> On Wed, 22 Oct 2014, Angelo Brito wrote:
>>
>> Hello all!
>>
>> I had some issues with interrupt handling. The scenario is as
>> follows:
>> We have an NVMe device with single MSI enabled, and some of its
>> transfers took about 1000 jiffies (ms) to complete. We saw this when
>> we used IOMeter to benchmark an NVMe controller and noticed that
>> about 1 in 10 commands took much longer than expected. When we traced
>> through the kernel code we tracked the issue to the nvme_irq
>> function. In most cases, it is triggered by the interrupt and all
>> CQEs in the queue are processed correctly. In some cases, though, we
>> found that a new CQE arrived while nvme_irq was processing previous
>> entries, or just after the CQ doorbell had been written. These
>> entries were overlooked by the driver and only picked up later by
>> nvme_kthread, which re-runs nvme_process_cq once every second.
>>
>> We read NVMe specification 1.1, section 7.5.1.1, which defines a
>> procedure for not losing any interrupts on a legacy (INTx) or MSI
>> system: the host software interrupt handler should mask and then
>> unmask the interrupt vector using the INTMS and INTMC registers. We
>> are proposing a change to the nvme_irq function as described below.
>> This is probably not needed on MSI-X enabled devices, but it is
>> harmless to leave it in for them as well.
>
>
> I don't think you can really assume it's harmless. It adds two
> unnecessary MMIO writes on every interrupt, and the spec says the
> results are undefined if you touch those registers when configured
> with MSI-X, so please change this to affect only MSI and INTx.
>
>
>> So we modified the nvme_irq function and tested it on our controller,
>> and the problem was fixed. The modified code is included below, since
>> we don't know how to submit it properly. Sorry about that.
>>
>> static irqreturn_t nvme_irq(int irq, void *data)
>> {
>>        irqreturn_t result;
>>        struct nvme_queue *nvmeq = data;
>> +       u32 maskvec;
>> +
>> +       /* We calculate which mask vector we use. */
>> +       maskvec = 1 << nvmeq->cq_vector;
>> +
>>        spin_lock(&nvmeq->q_lock);
>> +
>> +       /* We mask interrupts to this vector. */
>> +       writel(maskvec, &nvmeq->dev->bar->intms);
>> +
>>        nvme_process_cq(nvmeq);
>>        result = nvmeq->cqe_seen ? IRQ_HANDLED : IRQ_NONE;
>>        nvmeq->cqe_seen = 0;
>> +
>> +       /* And we unmask interrupt to this vector. */
>> +       writel(maskvec, &nvmeq->dev->bar->intmc);
>> +
>>        spin_unlock(&nvmeq->q_lock);
>>        return result;
>> }


