[PATCH v7 1/1] nvmet: support reservation feature

Sagi Grimberg sagi at grimberg.me
Fri Mar 8 02:07:52 PST 2024



On 08/03/2024 11:15, Guixin Liu wrote:
>
>> Unlike abort, preempt-and-abort needs a semantic guarantee because the
>> consumer may rely on this for fencing purposes. So it cannot be
>> supported in "best effort" I think.
>>
>> A possible implementation would be not to abort, as there is no such
>> interface, but nvmet may wait for all pending ns IO to complete and
>> disallow new IO from coming in (using percpu_ref_kill and
>> percpu_ref_resurrect on ns->ref). This won't work very efficiently
>> with ALL_REGS reservations though.
>
> Hi Sagi,
>
> I found that if we return an error when the call to
> percpu_ref_tryget_live(&ns->ref) fails, it might interrupt the IO of
> hosts that still have permission. Additionally, preempt_and_abort
> itself holds an ns->ref, so we cannot wait for the ref to drop to zero.
>
> The solution I can think of is to add a "per-namespace" percpu_ref to
> the controller for counting IO issued to a particular namespace by that
> controller. Then, during the execution of preempt_and_abort, we wait
> for the counts of the preempted and unregistered controllers to drop
> to zero.

Yes, that is what I had in mind as well. Obviously the ns->ref cannot be 
used for this purpose.
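
To make the idea concrete, here is a rough userspace model of a
per-controller, per-namespace counter. It is only a sketch: C11 atomics
stand in for percpu_ref, and struct ctrl_ns_ref and all the helper names
are hypothetical, not taken from the patch or from nvmet.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model: one of these per (controller, namespace) pair. */
struct ctrl_ns_ref {
	uint32_t nsid;
	atomic_bool live;	/* false while this controller is fenced */
	atomic_int inflight;	/* I/O this controller has issued to nsid */
};

static void ctrl_ns_ref_init(struct ctrl_ns_ref *r, uint32_t nsid)
{
	r->nsid = nsid;
	atomic_init(&r->live, true);
	atomic_init(&r->inflight, 0);
}

/* Submission path, playing the role of percpu_ref_tryget_live(). */
static bool ctrl_ns_ref_tryget_live(struct ctrl_ns_ref *r)
{
	/*
	 * Count first, then check: the fencing side clears 'live' before
	 * it reads 'inflight', so it cannot miss an admitted I/O.
	 */
	atomic_fetch_add(&r->inflight, 1);
	if (!atomic_load(&r->live)) {
		atomic_fetch_sub(&r->inflight, 1);
		return false;
	}
	return true;
}

/* Completion path: drop the count taken at submission. */
static void ctrl_ns_ref_put(struct ctrl_ns_ref *r)
{
	int prev = atomic_fetch_sub(&r->inflight, 1);

	assert(prev > 0);
	(void)prev;
}

/* Fence a preempted controller: no new I/O is admitted from here on. */
static void ctrl_ns_ref_kill(struct ctrl_ns_ref *r)
{
	atomic_store(&r->live, false);
}

/* Preempt-and-abort would wait until this is true for every victim. */
static bool ctrl_ns_ref_drained(struct ctrl_ns_ref *r)
{
	return atomic_load(&r->inflight) == 0;
}

/* Re-admit I/O, e.g. once the controller registers again. */
static void ctrl_ns_ref_resurrect(struct ctrl_ns_ref *r)
{
	atomic_store(&r->live, true);
}
```

In this model the preempting command is counted on its own controller's
ref, which is never killed, so it does not end up waiting on itself, and
controllers that keep their registration continue to submit I/O.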

>
> The nsid is user-specified, so we cannot use an array to store the
> per-namespace percpu_ref, and using an xarray would increase lookup
> overhead.

Yes, that is tricky to get right.

>
> What do you think, Sagi? Or maybe we can declare that
> preempt_and_abort is not supported, just like SPDK does.

It can definitely come incrementally, but at the very least it should
not be incorrectly supported.

Out of curiosity, doesn't your use-case need fencing protection against
inflight I/Os reordering during preemption?


