[RFC] nvme-rdma: Stop queues when starting with error recovery

Tue May 24 06:12:10 PDT 2022

On 5/23/22 18:21, Daniel Wagner wrote:
> When we enter error recovery we should stop all queue activities and
> all armed timers.
> 
> For example, we could arming an ANATT timer right before we enter
> error recovery but do not successfully recover before the timer
> fires. The timer is supposed only be active when the controller is in
> LIVE state hence we should call nvme_stop_ctrl when starting with the
> recover activites.

This takes me back... IIRC the main reason why the error recovery
handler in nvme-rdma (and nvme-tcp) does not start with nvme_stop_ctrl()
is because back then, nvme_stop_ctrl() could block until an
admin_timeout expired on an inflight admin command (I have a vague
memory that if cancelled/flushed the scan_work), which delayed I/O
failover, hence we just cherry-picked what needed to be stopped to make
sure I/O failover will not be blocked by anything and progress quickly.

If I look at nvme_stop_queue it does not touch the scan_work, but it
does flush ana_work which includes I/O, so it can still block on it
to fail after admin_timeout expires... also flush failfast_work is
taking the namespaces_rwsem (although read access). The fw_act_work
is probably meaningless for fabrics...

Maybe we need to have something like nvme_stop_ctrl_fast, that will just
stop the non-blocking stuff, so failover can work quickly, and then
after we have failover, we can do the full nvme_stop_ctrl() or the
reminder of nvme_stop_ctrl_fast...

thoughts?