Questions on Interruption handling

Angelo Brito asb at cin.ufpe.br
Fri Oct 24 10:59:21 PDT 2014


Thanks for looking into it.
Simply masking and unmasking the interrupts fixed our problem, but
perhaps it creates other issues.
So please keep us posted. We will watch for the ECN.

Regards,
Angelo Brito


On Fri, Oct 24, 2014 at 2:49 PM, Matthew Wilcox <willy at linux.intel.com> wrote:
> On Fri, Oct 24, 2014 at 01:51:59PM -0300, Angelo Brito wrote:
>> We can look more carefully at the functions you mentioned, but perhaps
>> there is a small difference in how we are reading the spec. We do not
>> send an MSI for every single CQ entry because the spec describes a
>> different behavior in section 7.5.1. It defines that the internal IS
>> vector should have a bit high when there are unanswered CQ entries and
>> the vector is not masked. The table then states that the MSI should be
>> sent only when a bit in the IS vector rises, meaning it either had
>> entries and was unmasked or it did not have entries and an entry came
>> in. I presume that was to reduce traffic in a very overloaded system.
>> This is for MSI and legacy only, of course; MSI-X uses a different
>> mechanism.
>>
>> Now, there is a window that we noticed. After the interrupt is
>> triggered, the host starts reading the CQs. It takes a few hundred
>> nanoseconds from the time the CQs have been read to the time the
>> doorbell arrives at the controller, and the controller will take time
>> to process it as well, probably up to a few microseconds. If the
>> controller decides to write a new entry in a CQ during this time, the
>> corresponding bit in the IS vector will already be high, therefore
>> there should be no new MSI. The host, though, has already checked the
>> CQs, so it will not see that new entries came in.
>>
>> We believe that is why section 7.5.1.1 states that the host should
>> mask interrupts and then release them. This way the host forces the
>> bits in the IS vector in the controller to go low and high again (see
>> section 7.5.1). If the host did not answer every single CQ entry, then
>> when the INTMC register is written a new MSI will be issued.
>
> Argh, the spec is buggy.  It should say that if the CQ doorbell write is
> less than the controller's notion of the CQ head, the controller
> should send another interrupt.  I've sent in a request to the NVMe
> workgroup that we do an ECN to fix this.
>
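
For readers following the thread, the window described in the quoted text can be sketched as a small toy model. This is only an illustration of the rule as the thread reads section 7.5.1 (an MSI is sent only when the IS bit rises), not driver or controller code. The `Controller` class, its methods, and `run()` are hypothetical names made up for this sketch; INTMS and INTMC are the mask set/clear registers.

```python
class Controller:
    """Toy model of the MSI/legacy rule as described in the thread (hypothetical)."""

    def __init__(self):
        self.tail = 0         # completion entries the controller has posted
        self.head = 0         # last CQ head doorbell value the controller has seen
        self.masked = False
        self.is_bit = False   # this queue's bit in the internal IS vector
        self.msi_sent = 0

    def _update(self):
        # The bit is high while entries are unanswered and the vector is unmasked.
        new_bit = (self.tail != self.head) and not self.masked
        if new_bit and not self.is_bit:
            self.msi_sent += 1    # an MSI goes out only on a rising edge
        self.is_bit = new_bit

    def post_completion(self):
        self.tail += 1
        self._update()

    def doorbell(self, head):
        self.head = head
        self._update()

    def set_mask(self, masked):
        self.masked = masked
        self._update()


def run(mask_around_processing):
    """Return (MSIs sent, entries the host has not yet seen)."""
    ctrl = Controller()
    ctrl.post_completion()        # first entry: rising edge, MSI #1
    if mask_around_processing:
        ctrl.set_mask(True)       # host masks (INTMS); the IS bit drops
    host_head = ctrl.tail         # host reads the CQ and sees one entry
    ctrl.post_completion()        # second entry lands before the doorbell arrives
    ctrl.doorbell(host_head)      # doorbell arrives; one entry is still pending
    if mask_around_processing:
        ctrl.set_mask(False)      # host unmasks (INTMC); the IS bit rises again
    return ctrl.msi_sent, ctrl.tail - host_head


print(run(mask_around_processing=False))  # (1, 1): entry pending, no second MSI
print(run(mask_around_processing=True))   # (2, 1): unmasking raises a second MSI
```

Without masking, the second entry arrives while the bit is already high, so no edge occurs and the host is never told about it. With the mask/unmask sequence, the write to INTMC produces the rising edge that reports the leftover entry.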


