[PATCH] nvme: Revert: Fix controller creation races with teardown flow
Sagi Grimberg
sagi at grimberg.me
Fri Aug 28 19:59:45 EDT 2020
>> This is indeed a regression.
>>
>> Perhaps we should also revert:
>> 12a0b6622107 ("nvme: don't hold nvmf_transports_rwsem for more than
>> transport lookups")
>>
>> Which inherently caused this by removing the serialization of
>> .create_ctrl()...
>
> no, I believe the patch on the semaphore is correct. Otherwise - things
> can be blocked a long time.. a minute (1 cmd timeout) or even multiple
> minutes in the case where a command failure in core layers effectively
> gets ignored and thus doesn't cause the error path in the transport.
> There can be multiple /dev/nvme-fabrics commands stacked up that can
> make the delays look much longer to the last guy.
>
> as far as creation vs teardown... yeah, not fun, but there are other
> ways to deal with it. FC: I got rid of the separate create/reconnect
> threads a while ago thus the return-control-while-reconnecting behavior,
> so I've had to deal with it. It's one area it'd be nice to see some
> convergence in implementation again between transports.
Doesn't FC have a bug there? In create_ctrl, after flushing
connect_work, what tells it whether delete is running concurrently
with it (or has already run...)?
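
To make the question concrete, here is a minimal userspace sketch of the
window being asked about. It is not the nvme-fc code: the state names,
struct ctrl, and every function below are hypothetical stand-ins, and the
model is single-threaded, so the "race" is simulated with a flag. A real
driver would need the state check to happen under whatever lock
serializes state changes.

```c
#include <stdbool.h>

/* Hypothetical controller states, for illustration only. */
enum ctrl_state { CTRL_CONNECTING, CTRL_LIVE, CTRL_DELETING };

struct ctrl {
	enum ctrl_state state;
};

/* Stand-in for the connect work: go live unless delete already won. */
static void connect_work(struct ctrl *c)
{
	if (c->state == CTRL_CONNECTING)
		c->state = CTRL_LIVE;
}

/* Stand-in for the teardown path. */
static void delete_ctrl(struct ctrl *c)
{
	c->state = CTRL_DELETING;
}

/*
 * Stand-in for the tail of create_ctrl. Running connect_work() here
 * models "flush the connect work". When delete_races is true, a delete
 * is simulated as landing right after the flush. Returns 0 if the
 * controller is live, -1 if teardown ran before or after the flush.
 */
static int create_ctrl_tail(struct ctrl *c, bool delete_races)
{
	connect_work(c);

	if (delete_races)
		delete_ctrl(c);

	/* Without this re-check, the caller would be handed a controller
	 * that is already being torn down. */
	return c->state == CTRL_LIVE ? 0 : -1;
}
```

With a controller that starts in CTRL_CONNECTING, create_ctrl_tail()
returns 0 when nothing interferes and -1 when the simulated delete lands
after the flush. It also returns -1 when the controller was already in
CTRL_DELETING before the flush, which is the "already ran" case.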