[PATCH V2 0/3] *** Implement the NVMe reservation feature ***
Guixin Liu
kanie at linux.alibaba.com
Mon Jan 15 18:29:15 PST 2024
在 2024/1/15 17:51, Sagi Grimberg 写道:
>
>
> On 1/14/24 11:23, Guixin Liu wrote:
>> Hi guys,
>> I've implemented the NVMe reservation feature. Please review it,
>> all
>> comments are welcome as usual.
>>
>> Changes from v1 to v2:
>> - Implement the reservation notification report, includes registration
>> preempted, reservation released and reservation preempted.
>> And also handle the reservation log page avaliable event and send get
>> reservation log page command to clear log page at host.
>>
>> - Put the reservation check access after validate opcode. And remove
>> opcodes which nvmet not implement yet check.
>> Now there is no admin opcode nvmet implemented needs reservation
>> check,
>> so I dont add reservation check to admin command path.
>> Next we need to do reservation check includes the situation of
>> nsid is
>> 0xffffffff at each admin command path, if it is needed.
>>
>> - Add reservation commands support in nvmet_get_cmd_effects_nvm().
>>
>> - From Chaitanya, change the local variable tree style to make it
>> cleaner,
>> and add some comments about NVMe spec.
>> And also change others advice from chaitanya.
>>
>> - Put the nvmet_pr_check_cmd_access and nvmet_parse_pr_cmd into
>> reservation
>> enable check warp.
>>
>> - Remove kmem_cache instead to use kmalloc and kfree.
>>
>> - Change others advice from Sagi.
>>
>> - Add a blktest test case, this patch will be sent before these
>> series of
>> pathes.
>>
>> Advice I dont adopt:
>> - From Sagi, use rcu instead of rwlock.
>> I need protect registrant_list, holder, type and generation
>> simutaneously,
>> the rcu can not protect multiple units.
>
> I don't see how the generation needs the same protection.
> As for the type and holder, perhaps we can find ordering that
> allows us to relax the dereference? Or make it a member of the holder?
>
> All I'm saying is that nvmet_pr_check_cmd_access() is dereferencing
> ns->pr->holder and ns->pr->holder->uuid, so the proposal is to update
> ns->pr->hostder with rcu synchronization such that the IO path will
> simply do:
>
> rcu_read_lock();
> if (!rcu_dereference(pr->holder))
> goto unlock;
> if (uuid_equal(&ctrl->hostid, &pr->holder->hostid)
> goto unlock;
> rtype = pr->rtype; // or pr->holder->rtype ?
> rcu_read_unlock();
>
> Nothing prevents you from protecting anything else on top of that.
>
>>
>> - From Sagi, use blkdev's pr_ops if it is exist.
>> The backend is unable to distinguish connections from the
>> frontend, If we
>> use the pr_ops of the block device, the block device's target would only
>> recognize it as a reservation by the nvme target.
>> Could you plz give me more information? Sagi?
>
> Umm, won't nvmet register the same key as transferred by the host? Not
> sure either how it should work.
I think the person who manages the nvmet configuration should make sure
that nvmet has exclusive access to the backend disk.
>
> In any event fwiw, if we can make the data path access cheap enough with
> rcu I don't think its a necessary optimization...
OK, I will do a deep thinking about rcu.
Thanks.
Guixin Liu
>
>>
>> - From Keith, use nvmet_get_cmd_effects_nvm to see if a command would
>> violate
>> a reservation.
>> The information is missing in effects, for example the flush
>> command should
>> be checked, but there is onely a NVME_CMD_EFFECTS_CSUPP flag in
>> effects, no
>> LBCC.
>>
>> Guixin Liu (3):
>> nvmet: support reservation feature
>> nvmet: unify aer type enum
>> nvme: introduce pr_work to handle resv event
>>
>> drivers/nvme/host/core.c | 47 +-
>> drivers/nvme/host/nvme.h | 1 +
>> drivers/nvme/target/Makefile | 2 +-
>> drivers/nvme/target/admin-cmd.c | 14 +-
>> drivers/nvme/target/configfs.c | 27 +
>> drivers/nvme/target/core.c | 53 +-
>> drivers/nvme/target/discovery.c | 2 +-
>> drivers/nvme/target/nvmet.h | 33 ++
>> drivers/nvme/target/pr.c | 887 ++++++++++++++++++++++++++++++++
>> include/linux/nvme.h | 54 +-
>> 10 files changed, 1103 insertions(+), 17 deletions(-)
>> create mode 100644 drivers/nvme/target/pr.c
>>
More information about the Linux-nvme
mailing list