[PATCH v1 0/4] Add command id quirk for fabrics
Sagi Grimberg
sagi at grimberg.me
Fri Nov 12 08:07:35 PST 2021
>> Completely disagree here. The TCP original report was just an example of
>> lack of protection we have against spurious completions. Nothing
>> specific about nvme-tcp here, this was discussed and agreed on in
>> the original report.
>>
> You are ignoring the facts:
>
> 1. The device that broke the spec in the first place was that device for
> which caused you to add the gen bits to CID.
Correct.
> 2. These gen bits are causing the limit of 4K Q_depth.
Correct. Which should be more than enough.
> 3. It's not mention anywhere in the spec, and if it was intended to be
> implemented like it's now - it would have mentioned in the spec.
As Keith mentioned, the spec doesn't define anything about how the host
should set the command id, hence it is free to do what it wants. The
spec doesn't need to say "the host can set the command id field however
it sees fit". See Keith comment on the field endianess.
> 4. Since gen bits were introduced, other devices got broken (such as
> Apple), hence the quirk for PCI.
You mean it _exposed_ an already broken device. Yes, that is correct.
> 5. The gen bits adds "if" conditions and logic to the fast path for
> "innosent" transports.
There is nothing transport specific about any of this, so I'm not sure
I understand what you are talking about.
> 6. This series just extends this quirk for fabrics.
I don't think the patchset got rejected, the ask afaict was to document
known broken controllers - exactly like pci quirks.
Here is the original question from Keith:
"Are there really fabrics targets behaving this way, or is this series
anticipating they might exist?"
I don't think there is any desire to keep any controller that got the
spec wrong in this particular case unusable.
> 7. Even if not broken, some devices may suffer from reduced performance
> having CID space spanning all 16 possible bit - fact that we ignore
Not sure exactly what you mean here. Again, if the controller assumes
anything on how the host would populate the command id it is doing
something wrong. It's not a theoretical argument, this controller
already does not work with existing at least one host implementation.
So you are correct, we don't take it into consideration.
More information about the Linux-nvme
mailing list