[PATCH V5 0/2] nvme-pci: fix the timeout case when reset is ongoing

James Smart james.smart at broadcom.com
Thu Jan 18 07:34:59 PST 2018


Jianchao,

This looks very coherent to me. Thank you.

-- james



On 1/18/2018 2:10 AM, Jianchao Wang wrote:
> Hello
>
> Please consider the following scenario.
> nvme_reset_ctrl
>    -> set state to RESETTING
>    -> queue reset_work
>      (scheduling)
> nvme_reset_work
>    -> nvme_dev_disable
>      -> quiesce queues
>      -> nvme_cancel_request
>         on outstanding requests
> -------------------------------_boundary_
>    -> nvme initializing (issue request on adminq)
>
> Before the _boundary_, we not only quiesce the queues but also cancel
> all the outstanding requests.
>
> A request could expire while the ctrl state is RESETTING.
>   - If the timeout occurs before the _boundary_, the expired requests
>     are from the previous work.
>   - Otherwise, the expired requests are from the controller initializing
>     procedure, such as sending cq/sq create commands to adminq to set up
>     io queues.
> In the current implementation, nvme_timeout cannot identify the
> _boundary_, so it only handles the second case above.
>
> In fact, after Sagi's commit (nvme-rdma: fix concurrent reset and
> reconnect), both nvme-fc and nvme-rdma have the following pattern:
> RESETTING    - quiesce blk-mq queues, teardown and delete queues/
>                 connections, clear out outstanding IO requests...
> RECONNECTING - establish new queues/connections and some other
>                 initializing things.
> Introduce RECONNECTING to the nvme-pci transport to mark the same phase.
> Then we get a coherent state definition among the nvme pci/rdma/fc
> transports, and nvme_timeout can identify the _boundary_.
>
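For readers following the thread, the idea above can be sketched in a few
lines of standalone C. This is only an illustration under stated
assumptions, not the code from the patches: the enum values, the action
names, and classify_timeout() are all hypothetical, and the cover letter
does not say what the handler should actually do in each case. The sketch
only shows that once RESETTING and RECONNECTING are separate states, a
timeout handler can tell on which side of the _boundary_ a request expired.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-ins for the controller states discussed above. */
enum ctrl_state {
	CTRL_LIVE,
	CTRL_RESETTING,		/* before the boundary: quiesce, cancel requests */
	CTRL_RECONNECTING,	/* after the boundary: controller initialization */
};

/* Hypothetical placeholder actions; the real handler's choices may differ. */
enum timeout_action {
	ACTION_NORMAL_RECOVERY,		/* not in reset: usual timeout handling */
	ACTION_FROM_PREVIOUS_WORK,	/* expired request predates the reset */
	ACTION_FROM_INITIALIZATION,	/* expired request was issued during init */
};

/* Decide which side of the boundary an expired request belongs to. */
static enum timeout_action classify_timeout(enum ctrl_state state)
{
	switch (state) {
	case CTRL_RESETTING:
		return ACTION_FROM_PREVIOUS_WORK;
	case CTRL_RECONNECTING:
		return ACTION_FROM_INITIALIZATION;
	default:
		return ACTION_NORMAL_RECOVERY;
	}
}

/* Human-readable name for an action, useful for logging in a test harness. */
static const char *action_name(enum timeout_action action)
{
	static const char *const names[] = {
		"normal-recovery",
		"from-previous-work",
		"from-initialization",
	};

	assert((size_t)action < sizeof(names) / sizeof(names[0]));
	return names[action];
}
```

With a single RESETTING state covering both phases, classify_timeout()
would have no way to return two different answers, which is the problem
the cover letter describes.
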
> V5:
>   - discard RESET_PREPARE and introduce RECONNECTING into nvme-pci
>   - change the 1st patch's name and comment
>   - other misc changes
>
> V4:
>   - rebase patches on Jens' for-next
>   - let RESETTING equal to RECONNECTING in terms of work procedure
>   - change the 1st patch's name and comment
>   - other misc changes
>
> V3:
>   - fix wrong reference in loop.c
>   - other misc changes
>
> V2:
>   - split NVME_CTRL_RESETTING into NVME_CTRL_RESET_PREPARE and
>     NVME_CTRL_RESETTING. Introduce new patch based on this.
>   - distinguish the requests based on the new state in nvme_timeout
>   - change comments of patch
>
> drivers/nvme/host/core.c |  2 +-
> drivers/nvme/host/pci.c  | 43 ++++++++++++++++++++++++++++++++-----------
> 2 files changed, 33 insertions(+), 12 deletions(-)
>
> Thanks
> Jianchao



