[PATCH V2 0/3] *** Implement the NVMe reservation feature ***

Guixin Liu kanie at linux.alibaba.com
Mon Jan 15 18:29:15 PST 2024


在 2024/1/15 17:51, Sagi Grimberg 写道:
>
>
> On 1/14/24 11:23, Guixin Liu wrote:
>> Hi guys,
>>      I've implemented the NVMe reservation feature. Please review it, 
>> all
>> comments are welcome as usual.
>>
>> Changes from v1 to v2:
>> - Implement the reservation notification report, includes registration
>> preempted, reservation released and reservation preempted.
>>    And also handle the reservation log page avaliable event and send get
>> reservation log page command to clear log page at host.
>>
>> - Put the reservation check access after validate opcode. And remove
>> opcodes which nvmet not implement yet check.
>>    Now there is no admin opcode nvmet implemented needs reservation 
>> check,
>> so I dont add reservation check to admin command path.
>>    Next we need to do reservation check includes the situation of 
>> nsid is
>> 0xffffffff at each admin command path, if it is needed.
>>
>> - Add reservation commands support in nvmet_get_cmd_effects_nvm().
>>
>> - From Chaitanya, change the local variable tree style to make it 
>> cleaner,
>> and add some comments about NVMe spec.
>>    And also change others advice from chaitanya.
>>
>> - Put the nvmet_pr_check_cmd_access and nvmet_parse_pr_cmd into 
>> reservation
>> enable check warp.
>>
>> - Remove kmem_cache instead to use kmalloc and kfree.
>>
>> - Change others advice from Sagi.
>>
>> - Add a blktest test case, this patch will be sent before these 
>> series of
>> pathes.
>>
>> Advice I dont adopt:
>> - From Sagi, use rcu instead of rwlock.
>>    I need protect registrant_list, holder, type and generation 
>> simutaneously,
>> the rcu can not protect multiple units.
>
> I don't see how the generation needs the same protection.
> As for the type and holder, perhaps we can find ordering that
> allows us to relax the dereference? Or make it a member of the holder?
>
> All I'm saying is that nvmet_pr_check_cmd_access() is dereferencing
> ns->pr->holder and ns->pr->holder->uuid, so the proposal is to update
> ns->pr->hostder with rcu synchronization such that the IO path will
> simply do:
>
>     rcu_read_lock();
>     if (!rcu_dereference(pr->holder))
>         goto unlock;
>     if (uuid_equal(&ctrl->hostid, &pr->holder->hostid)
>         goto unlock;
>     rtype = pr->rtype; // or pr->holder->rtype ?
>     rcu_read_unlock();
>
> Nothing prevents you from protecting anything else on top of that.
>
>>
>> - From Sagi, use blkdev's pr_ops if it is exist.
>>    The backend is unable to distinguish connections from the 
>> frontend, If we
>> use the pr_ops of the block device, the block device's target would only
>> recognize it as a reservation by the nvme target.
>>    Could you plz give me more information? Sagi?
>
> Umm, won't nvmet register the same key as transferred by the host? Not
> sure either how it should work.

I think the person who manages the nvmet configuration should make sure

that nvmet has exclusive access to the backend disk.

>
> In any event fwiw, if we can make the data path access cheap enough with
> rcu I don't think its a necessary optimization...

OK, I will do a deep thinking about rcu.

Thanks.

Guixin Liu

>
>>
>> - From Keith, use nvmet_get_cmd_effects_nvm to see if a command would 
>> violate
>> a reservation.
>>    The information is missing in effects, for example the flush 
>> command should
>> be checked, but there is onely a  NVME_CMD_EFFECTS_CSUPP flag in 
>> effects, no
>> LBCC.
>>
>> Guixin Liu (3):
>>    nvmet: support reservation feature
>>    nvmet: unify aer type enum
>>    nvme: introduce pr_work to handle resv event
>>
>>   drivers/nvme/host/core.c        |  47 +-
>>   drivers/nvme/host/nvme.h        |   1 +
>>   drivers/nvme/target/Makefile    |   2 +-
>>   drivers/nvme/target/admin-cmd.c |  14 +-
>>   drivers/nvme/target/configfs.c  |  27 +
>>   drivers/nvme/target/core.c      |  53 +-
>>   drivers/nvme/target/discovery.c |   2 +-
>>   drivers/nvme/target/nvmet.h     |  33 ++
>>   drivers/nvme/target/pr.c        | 887 ++++++++++++++++++++++++++++++++
>>   include/linux/nvme.h            |  54 +-
>>   10 files changed, 1103 insertions(+), 17 deletions(-)
>>   create mode 100644 drivers/nvme/target/pr.c
>>



More information about the Linux-nvme mailing list