[PATCH] nvme: mark ctrl as DEAD if removing from error recovery

Sagi Grimberg sagi at grimberg.me
Wed Jun 28 01:10:26 PDT 2023


>> @@ -4055,7 +4056,8 @@ void nvme_remove_namespaces(struct nvme_ctrl *ctrl)
>>   	 * removing the namespaces' disks; fail all the queues now to avoid
>>   	 * potentially having to clean up the failed sync later.
>>   	 */
>> -	if (ctrl->state == NVME_CTRL_DEAD) {
>> +	if (ctrl->state == NVME_CTRL_DEAD ||
>> +			ctrl->old_state != NVME_CTRL_LIVE) {
> 
> Style nitpick: this is really nasty to read, pleas align things
> properly:
> 
> 	if (ctrl->state == NVME_CTRL_DEAD ||
> 	    ctrl->old_state != NVME_CTRL_LIVE) {
> 
> But I'm really worried about this completely uncodumented and fragile
> magic here.  The existing magic NVME_CTRL_DEAD is bad enough, but
> backtracking to the previous state just makes this worse.

Agree.

> I think a big problem is that the call to nvme_mark_namespaces_dead
> and nvme_unquiesce_io_queues is hidden inside of
> nvme_remove_namespaces, and I think there's no good way around
> us unwinding the currently unpair quiesce calls and fixing this
> for real.

My suggestion initially was to add a ctrl interface to say if the
transport can accept I/O: ctrl.ops.io_capable() and mark dead_unquiesce
based on that instead of the state. But I'm not sure it is better.



More information about the Linux-nvme mailing list