[PATCH] nvme: Revert: Fix controller creation races with teardown flow

Sagi Grimberg sagi at grimberg.me
Mon Aug 31 19:15:22 EDT 2020


>>>> This is indeed a regression.
>>>>
>>>> Perhaps we should also revert:
>>>> 12a0b6622107 ("nvme: don't hold nvmf_transports_rwsem for more than 
>>>> transport lookups")
>>>>
>>>> Which inherently caused this by removing the serialization of
>>>> .create_ctrl()...
>>>
>>> no, I believe the patch on the semaphore is correct. Otherwise - 
>>> things can be blocked a long time.. a minute (1 cmd timeout) or even 
>>> multiple minutes in the case where a command failure in core layers 
>>> effectively gets ignored and thus doesn't cause the error path in the 
>>> transport. There can be multiple /dev/nvme-fabrics commands stacked 
>>> up that can make the delays look much longer to the last guy.
>>>
>>> as far as creation vs teardown... yeah, not fun, but there are other 
>>> ways to deal with it. FC: I got rid of the separate create/reconnect 
>>> threads a while ago thus the return-control-while-reconnecting 
>>> behavior, so I've had to deal with it.  It's one area it'd be nice to 
>>> see some convergence in implementation again between transports.
>>
>> Doesn't fc have a bug there? in create_ctrl after flushing the
>> connect_work, what is telling it if delete is running in with it
>> (or that it already ran...)
> 
> I guess I don't understand what the bug is you are thinking about. Maybe 
> there's a short period that the ctrl ptr is perhaps freed, thus the 
> pointer shouldn't be used - but I don't see it as almost everything is 
> simply looking at  the value of the pointer, not dereferencing it.

I'm referring to nvme_fc_init_ctrl, if delete happens while it
is waiting in flush_delayed_work(&ctrl->connect_work); won't you
dereference and return a controller that is possibly already
deleted/freed?

> I do have a bug or two  with delete association fighting with 
> create_association - but it's mainly due to nvme_fc_error_recovery not 
> the delete routine. I've reworked this area after seeing your other 
> patches and will be posting after some more testing.  But no reason for 
> synchronizing all ctrl creates.

Is it that big of an issue? it should fail rather quickly shouldn't it?



More information about the Linux-nvme mailing list