[PATCH V2 0/3] *** Implement the NVMe reservation feature ***

Sagi Grimberg sagi at grimberg.me
Mon Jan 15 01:51:52 PST 2024



On 1/14/24 11:23, Guixin Liu wrote:
> Hi guys,
>      I've implemented the NVMe reservation feature. Please review it, all
> comments are welcome as usual.
> 
> Changes from v1 to v2:
> - Implement the reservation notification report, includes registration
> preempted, reservation released and reservation preempted.
>    And also handle the reservation log page avaliable event and send get
> reservation log page command to clear log page at host.
> 
> - Put the reservation check access after validate opcode. And remove
> opcodes which nvmet not implement yet check.
>    Now there is no admin opcode nvmet implemented needs reservation check,
> so I dont add reservation check to admin command path.
>    Next we need to do reservation check includes the situation of nsid is
> 0xffffffff at each admin command path, if it is needed.
> 
> - Add reservation commands support in nvmet_get_cmd_effects_nvm().
> 
> - From Chaitanya, change the local variable tree style to make it cleaner,
> and add some comments about NVMe spec.
>    And also change others advice from chaitanya.
> 
> - Put the nvmet_pr_check_cmd_access and nvmet_parse_pr_cmd into reservation
> enable check warp.
> 
> - Remove kmem_cache instead to use kmalloc and kfree.
> 
> - Change others advice from Sagi.
> 
> - Add a blktest test case, this patch will be sent before these series of
> pathes.
> 
> Advice I dont adopt:
> - From Sagi, use rcu instead of rwlock.
>    I need protect registrant_list, holder, type and generation simutaneously,
> the rcu can not protect multiple units.

I don't see how the generation needs the same protection.
As for the type and holder, perhaps we can find ordering that
allows us to relax the dereference? Or make it a member of the holder?

All I'm saying is that nvmet_pr_check_cmd_access() is dereferencing
ns->pr->holder and ns->pr->holder->uuid, so the proposal is to update
ns->pr->hostder with rcu synchronization such that the IO path will
simply do:

	rcu_read_lock();
	if (!rcu_dereference(pr->holder))
		goto unlock;
	if (uuid_equal(&ctrl->hostid, &pr->holder->hostid)
		goto unlock;
	rtype = pr->rtype; // or pr->holder->rtype ?
	rcu_read_unlock();

Nothing prevents you from protecting anything else on top of that.

> 
> - From Sagi, use blkdev's pr_ops if it is exist.
>    The backend is unable to distinguish connections from the frontend, If we
> use the pr_ops of the block device, the block device's target would only
> recognize it as a reservation by the nvme target.
>    Could you plz give me more information? Sagi?

Umm, won't nvmet register the same key as transferred by the host? Not
sure either how it should work.

In any event fwiw, if we can make the data path access cheap enough with
rcu I don't think its a necessary optimization...

> 
> - From Keith, use nvmet_get_cmd_effects_nvm to see if a command would violate
> a reservation.
>    The information is missing in effects, for example the flush command should
> be checked, but there is onely a  NVME_CMD_EFFECTS_CSUPP flag in effects, no
> LBCC.
> 
> Guixin Liu (3):
>    nvmet: support reservation feature
>    nvmet: unify aer type enum
>    nvme: introduce pr_work to handle resv event
> 
>   drivers/nvme/host/core.c        |  47 +-
>   drivers/nvme/host/nvme.h        |   1 +
>   drivers/nvme/target/Makefile    |   2 +-
>   drivers/nvme/target/admin-cmd.c |  14 +-
>   drivers/nvme/target/configfs.c  |  27 +
>   drivers/nvme/target/core.c      |  53 +-
>   drivers/nvme/target/discovery.c |   2 +-
>   drivers/nvme/target/nvmet.h     |  33 ++
>   drivers/nvme/target/pr.c        | 887 ++++++++++++++++++++++++++++++++
>   include/linux/nvme.h            |  54 +-
>   10 files changed, 1103 insertions(+), 17 deletions(-)
>   create mode 100644 drivers/nvme/target/pr.c
> 



More information about the Linux-nvme mailing list