[PATCH v15 2/2] nvmet: support reservation feature

Guixin Liu kanie at linux.alibaba.com
Wed Oct 16 19:10:02 PDT 2024


On 2024/10/16 14:59, Shinichiro Kawasaki wrote:
> On Oct 14, 2024 / 14:10, Guixin Liu wrote:
>> This patch implements the reservation feature, including:
>>    1. reservation register(register, unregister and replace).
>>    2. reservation acquire(acquire, preempt, preempt and abort).
>>    3. reservation release(release and clear).
>>    4. reservation report.
>>    5. set feature and get feature of reservation notify mask.
>>    6. get log page of reservation event.
>>
>> Not supported:
>>    1. persistent reservation through power loss.
>>
>> Test cases:
>>    Use nvme-cli and fio to test all implemented sub features:
>>    1. use nvme resv-register to register host a registrant or
>>       unregister or replace a new key.
>>    2. use nvme resv-acquire to set host to the holder, and use fio
>>       to send read and write io in all reservation type. And also
>>       test preempt and "preempt and abort".
>>    3. use nvme resv-report to show all registrants and reservation
>>       status.
>>    4. use nvme resv-release to release all registrants.
>>    5. use nvme get-log to get events generated by the preceding
>>       operations.
>>
>> In addition, make reservation configurable, one can set ns to
>> support reservation before enable ns. The default of resv_enable
>> is false.
> Hello Guixin. To test the blktests patches for the reservation feature, I
> applied this patch series on top of v6.12-rc3 kernel with lockdep enabled. When I
> ran the blktests test case nvme/004, I observed the kernel message "INFO: trying
> to register non-static key" [1]. I think the call trace indicates that the
> nvmet_ctrl_destroy_pr() calls xa_erase() for uninitialized ns->pr_per_ctrl_refs.
>
> [...]
>
>> +void nvmet_ctrl_destroy_pr(struct nvmet_ctrl *ctrl)
>> +{
>> +	struct nvmet_pr_per_ctrl_ref *pc_ref;
>> +	struct nvmet_ns *ns;
>> +	unsigned long idx;
>> +
>> +	kfifo_free(&ctrl->pr_log_mgr.log_queue);
>> +	mutex_destroy(&ctrl->pr_log_mgr.lock);
>> +
>> +	xa_for_each(&ctrl->subsys->namespaces, idx, ns) {
>                 if (!ns->pr.enable)
>                         continue;
>
> I added the two lines above here, and the INFO message disappeared. Please
> check if this change makes sense.

Yes, I missed the case where pr is not enabled here. I will fix this and
send a v16.
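
For illustration only, here is a small user-space C sketch of why the guard
matters. It is not the nvmet code: the model_* names, the single-slot table
standing in for the xarray, and the "initialized" flag standing in for the
lockdep check are all assumptions made for this example. The table is set up
only when reservations are enabled on a namespace, so the destroy loop must
skip namespaces where they are not.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for an xarray; holds a single entry. */
struct model_ref_table {
	bool initialized;	/* set only by model_table_init() */
	void *slot;
};

/* Hypothetical stand-in for struct nvmet_ns. */
struct model_ns {
	bool pr_enable;			/* models ns->pr.enable */
	struct model_ref_table refs;	/* models ns->pr_per_ctrl_refs */
};

static void model_table_init(struct model_ref_table *t)
{
	t->initialized = true;
	t->slot = NULL;
}

/*
 * Remove and return the stored entry. Touching a table that was never
 * initialized bumps *bad_access, which models the lockdep complaint.
 */
static void *model_table_erase(struct model_ref_table *t, int *bad_access)
{
	void *entry;

	if (!t->initialized) {
		(*bad_access)++;
		return NULL;
	}
	entry = t->slot;
	t->slot = NULL;
	return entry;
}

/* The table is initialized only when reservations are enabled. */
static void model_ns_setup(struct model_ns *ns, bool enable)
{
	ns->pr_enable = enable;
	ns->refs.initialized = false;
	ns->refs.slot = NULL;
	if (enable) {
		model_table_init(&ns->refs);
		ns->refs.slot = malloc(16);
	}
}

/*
 * Tear down per-controller state on every namespace. Returns how many
 * uninitialized tables were touched; with guard == true this is 0.
 */
static int model_ctrl_destroy_pr(struct model_ns *nss, size_t n, bool guard)
{
	int bad_access = 0;
	size_t i;

	assert(nss != NULL || n == 0);
	for (i = 0; i < n; i++) {
		if (guard && !nss[i].pr_enable)
			continue;	/* table was never set up */
		free(model_table_erase(&nss[i].refs, &bad_access));
	}
	return bad_access;
}
```

With one namespace enabled and one disabled, calling
model_ctrl_destroy_pr() with guard set to true touches no uninitialized
table, while guard set to false touches exactly one.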

Thanks very much.

Best Regards,

Guixin Liu

>> +		pc_ref = xa_erase(&ns->pr_per_ctrl_refs, ctrl->cntlid);
>> +		if (pc_ref)
>> +			percpu_ref_exit(&pc_ref->ref);
>> +		kfree(pc_ref);
>> +	}
>> +}
>
> [1]
>
> [   39.842700] [   T1002] run blktests nvme/004 at 2024-10-16 15:24:52
> [   39.901831] [   T1046] loop0: detected capacity change from 0 to 2097152
> [   39.920817] [   T1049] nvmet: adding nsid 1 to subsystem blktests-subsystem-1
> [   40.001028] [    T107] nvmet: creating nvm controller 1 for subsystem blktests-subsystem-1 for NQN nqn.2014-08.org.nvmexpress:uuid:0f01fb42-9f7f-4856-b0b3-51e60b8de349.
> [   40.005448] [   T1056] nvme nvme1: Please enable CONFIG_NVME_MULTIPATH for full support of multi-port devices.
> [   40.007399] [   T1056] nvme nvme1: creating 4 I/O queues.
> [   40.009800] [   T1056] nvme nvme1: new ctrl: "blktests-subsystem-1"
> [   40.178428] [   T1078] nvme nvme1: Removing ctrl: NQN "blktests-subsystem-1"
> [   40.286197] [   T1078] INFO: trying to register non-static key.
> [   40.287630] [   T1078] The code is fine but needs lockdep annotation, or maybe
> [   40.288482] [   T1078] you didn't initialize this object before use?
> [   40.288930] [   T1078] turning off the locking correctness validator.
> [   40.289382] [   T1078] CPU: 3 UID: 0 PID: 1078 Comm: nvme Not tainted 6.12.0-rc3+ #338
> [   40.289942] [   T1078] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.3-2.fc40 04/01/2014
> [   40.290624] [   T1078] Call Trace:
> [   40.290866] [   T1078]  <TASK>
> [   40.291078] [   T1078]  dump_stack_lvl+0x6a/0x90
> [   40.291414] [   T1078]  register_lock_class+0xe2a/0x10a0
> [   40.291790] [   T1078]  ? __lock_acquire+0xd1b/0x5f20
> [   40.292212] [   T1078]  ? __pfx_register_lock_class+0x10/0x10
> [   40.292619] [   T1078]  __lock_acquire+0x81e/0x5f20
> [   40.292970] [   T1078]  ? lock_is_held_type+0xd5/0x130
> [   40.293331] [   T1078]  ? find_held_lock+0x2d/0x110
> [   40.293679] [   T1078]  ? __pfx___lock_acquire+0x10/0x10
> [   40.294053] [   T1078]  ? lock_release+0x460/0x7a0
> [   40.294389] [   T1078]  ? __pfx_lock_release+0x10/0x10
> [   40.294752] [   T1078]  lock_acquire.part.0+0x12d/0x360
> [   40.295118] [   T1078]  ? xa_erase+0xd/0x30
> [   40.295412] [   T1078]  ? __pfx_lock_acquire.part.0+0x10/0x10
> [   40.295818] [   T1078]  ? rcu_is_watching+0x11/0xb0
> [   40.296161] [   T1078]  ? trace_lock_acquire+0x12f/0x1a0
> [   40.296531] [   T1078]  ? __pfx___flush_work+0x10/0x10
> [   40.296895] [   T1078]  ? xa_erase+0xd/0x30
> [   40.297187] [   T1078]  ? lock_acquire+0x2d/0xc0
> [   40.297509] [   T1078]  ? xa_erase+0xd/0x30
> [   40.297805] [   T1078]  _raw_spin_lock+0x2f/0x40
> [   40.298130] [   T1078]  ? xa_erase+0xd/0x30
> [   40.298421] [   T1078]  xa_erase+0xd/0x30
> [   40.298705] [   T1078]  nvmet_ctrl_destroy_pr+0x10e/0x1c0 [nvmet]
> [   40.299148] [   T1078]  ? rcu_is_watching+0x11/0xb0
> [   40.299492] [   T1078]  ? __pfx_nvmet_ctrl_destroy_pr+0x10/0x10 [nvmet]
> [   40.300676] [   T1078]  ? __pfx___might_resched+0x10/0x10
> [   40.301908] [   T1078]  nvmet_ctrl_free+0x2f0/0x830 [nvmet]
> [   40.303178] [   T1078]  ? lockdep_hardirqs_on+0x78/0x100
> [   40.304462] [   T1078]  ? __pfx_nvmet_ctrl_free+0x10/0x10 [nvmet]
> [   40.305771] [   T1078]  ? __pfx___cancel_work+0x10/0x10
> [   40.306974] [   T1078]  ? kfree+0x13e/0x4a0
> [   40.308051] [   T1078]  nvmet_sq_destroy+0x1f2/0x3a0 [nvmet]
> [   40.309248] [   T1078]  ? __pfx_sysfs_kf_write+0x10/0x10
> [   40.310396] [   T1078]  nvme_loop_destroy_admin_queue+0x6b/0x90 [nvme_loop]
> [   40.311701] [   T1078]  nvme_do_delete_ctrl+0x149/0x160 [nvme_core]
> [   40.312949] [   T1078]  nvme_delete_ctrl_sync.cold+0x8/0xd [nvme_core]
> [   40.314194] [   T1078]  nvme_sysfs_delete+0x92/0xb0 [nvme_core]
> [   40.315355] [   T1078]  kernfs_fop_write_iter+0x39e/0x5a0
> [   40.316466] [   T1078]  vfs_write+0x5e1/0xe70
> [   40.317475] [   T1078]  ? __pfx_vfs_write+0x10/0x10
> [   40.318497] [   T1078]  ? lockdep_hardirqs_on+0x78/0x100
> [   40.319550] [   T1078]  ? __call_rcu_common.constprop.0+0x345/0xed0
> [   40.320697] [   T1078]  ? __pfx___call_rcu_common.constprop.0+0x10/0x10
> [   40.321870] [   T1078]  ksys_write+0xf7/0x1d0
> [   40.322828] [   T1078]  ? __pfx_ksys_write+0x10/0x10
> [   40.323842] [   T1078]  do_syscall_64+0x93/0x180
> [   40.324814] [   T1078]  ? do_syscall_64+0x9f/0x180
> [   40.325788] [   T1078]  ? lockdep_hardirqs_on_prepare+0x16d/0x400
> [   40.326875] [   T1078]  entry_SYSCALL_64_after_hwframe+0x76/0x7e
> [   40.327958] [   T1078] RIP: 0033:0x7fdf35a1d984
> [   40.328906] [   T1078] Code: c7 00 16 00 00 00 b8 ff ff ff ff c3 66 2e 0f 1f 84 00 00 00 00 00 f3 0f 1e fa 80 3d c5 06 0e 00 00 74 13 b8 01 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 54 c3 0f 1f 00 55 48 89 e5 48 83 ec 20 48 89
> [   40.331891] [   T1078] RSP: 002b:00007ffce23c3ed8 EFLAGS: 00000202 ORIG_RAX: 0000000000000001
> [   40.333291] [   T1078] RAX: ffffffffffffffda RBX: 0000000000000003 RCX: 00007fdf35a1d984
> [   40.334651] [   T1078] RDX: 0000000000000001 RSI: 00007fdf35b40ed1 RDI: 0000000000000003
> [   40.336058] [   T1078] RBP: 00007fdf35b40ed1 R08: 0000000000000200 R09: 00000000ffffffff
> [   40.337426] [   T1078] R10: 0000000000000000 R11: 0000000000000202 R12: 0000000037510610
> [   40.338802] [   T1078] R13: 00007ffce23c56c5 R14: 0000000037510610 R15: 0000000037510eb0
> [   40.340178] [   T1078]  </TASK>


