[PATCH 1/4] nvme: fix delete uninitialized controller
Sagi Grimberg
sagi at grimberg.me
Tue Jan 3 02:30:45 PST 2023
> nvme-fabric controllers can be deleted by
> /sys/class/nvme/nvme<NS>/delete_controller
> echo 1 > /sys/class/nvme/nvme<NS>/delete_controller
> The above command will call nvme_delete_ctrl_sync().
> This function internally tries to change ctrl->state to NVME_CTRL_DELETING.
> NVME_CTRL_LIVE, NVME_CTRL_RESETTING, and NVME_CTRL_CONNECTING states can
> be changed to NVME_CTRL_DELETING.
> If the state is successfully changed, nvme_do_delete_ctrl() is called,
> which is the actual delete logic of controller.
>
> controller initialization logic changes ctrl->state.
> NEW -> CONNECTING -> LIVE.
> NVME_CTRL_CONNECTING state doesn't ensure that initialization is done.
>
> So, delete logic can be called before the finish of controller
> initialization.
> So kernel panic would occur because nvme_do_delete_ctrl() dereferences
> uninitialized values.
>
> BUG: KASAN: null-ptr-deref in do_raw_spin_trylock+0x67/0x180
> Read of size 4 at addr 00000000000000c0 by task bash/928
>
> CPU: 7 PID: 928 Comm: bash Not tainted 6.1.0 #35
> nvme nvme0: Connect command failed: host path error
> Call Trace:
> <TASK>
> dump_stack_lvl+0x57/0x81
> ? do_raw_spin_trylock+0x67/0x180
> kasan_report+0xba/0x1f0
> nvme nvme0: failed to connect queue: 0 ret=880
> ? do_raw_spin_trylock+0x67/0x180
> ? sysfs_file_ops+0x170/0x170
> kasan_check_range+0x14a/0x1a0
> do_raw_spin_trylock+0x67/0x180
> ? do_raw_spin_lock+0x270/0x270
> ? nvme_remove_namespaces+0x1bc/0x3d0
> _raw_spin_lock_irqsave+0x4b/0x90
> ? blk_mq_quiesce_queue+0x1b/0x160
> blk_mq_quiesce_queue+0x1b/0x160
> nvme_tcp_delete_ctrl+0x4b/0x70
> nvme_do_delete_ctrl+0x135/0x141
> nvme_sysfs_delete.cold+0x8/0xd
> kernfs_fop_write_iter+0x34b/0x520
> vfs_write+0x83a/0xd20
> ? kernel_write+0x630/0x630
> ? rcu_read_lock_sched_held+0x12/0x80
> ? lock_acquire+0x4f4/0x630
> ? __fget_light+0x51/0x230
> ksys_write+0xf9/0x1d0
> ? __ia32_sys_read+0xa0/0xa0
> ? syscall_enter_from_user_mode+0x1d/0x50
> do_syscall_64+0x3b/0x90
> entry_SYSCALL_64_after_hwframe+0x63/0xcd
> RIP: 0033:0x7fc955d10104
>
> Fixes: 1a353d85b02d ("nvme: add fabrics sysfs attributes")
> Signed-off-by: Taehee Yoo <ap420073 at gmail.com>
> ---
> drivers/nvme/host/core.c | 3 ++-
> 1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
> index d307ae4d8a57..cd4c80ca66d4 100644
> --- a/drivers/nvme/host/core.c
> +++ b/drivers/nvme/host/core.c
> @@ -243,7 +243,8 @@ static void nvme_delete_ctrl_sync(struct nvme_ctrl *ctrl)
> * since ->delete_ctrl can free the controller.
> */
> nvme_get_ctrl(ctrl);
> - if (nvme_change_ctrl_state(ctrl, NVME_CTRL_DELETING))
> + if (test_bit(NVME_CTRL_STARTED_ONCE, &ctrl->flags) &&
> + nvme_change_ctrl_state(ctrl, NVME_CTRL_DELETING))
> nvme_do_delete_ctrl(ctrl);
So what is the outcome now? if the controller kept on dangling? what
triggers the controller deletion?
> nvme_put_ctrl(ctrl);
> }
I don't think this is the correct approach.
the delete should fully fence the initialization and then delete
the controller.
In this case, the transport driver should not quiesce a non-existent
queue.
If further synchronization is needed, then it should be added so that
delete will fully fence the initialization.
More information about the Linux-nvme
mailing list