[PATCH 3/3] nvme: code command_id with a genctr for use-after-free validation

Mon May 17 12:09:46 PDT 2021

On 5/17/21 10:59 AM, Sagi Grimberg wrote:
> We cannot detect a (perhaps buggy) controller that is sending us
> a completion for a request that was already completed (for example
> sending a completion twice), this phenomenon was seen in the wild
> a few times.
> 
> So to protect against this, we use the upper 4 msbits of the nvme sqe
> command_id to use as a 4-bit generation counter and verify it matches
> the existing request generation that is incrementing on every execution.
> 
> The 16-bit command_id structure now is constructed by:
> | xxxx | xxxxxxxxxxxx |
>   gen    request tag
> 
> This means that we are giving up some possible queue depth as 12 bits
> allow for a maximum queue depth of 4095 instead of 65536, however we
> never create such long queues anyways so no real harm done.

Is a four bit generation counter sufficient? Shouldn't such a counter be
at least 32 bits to provide a reasonable protection against e.g.
duplicate packets generated by retry mechanisms in networking stacks?

Additionally, I do not agree with the statement "we never create such
long queues anyways". I have already done this myself.

Thanks,

Bart.