[PATCHv2] nvme-pci: fix timeout request state check
Keith Busch
kbusch at kernel.org
Wed Jan 18 07:21:58 PST 2023
On Wed, Jan 18, 2023 at 08:33:30AM +0100, Christoph Hellwig wrote:
> On Tue, Jan 17, 2023 at 10:52:39PM -0700, Keith Busch wrote:
> > We're actually not batching here (no IOB in the timeout context), so we
> > are either:
> >
> > a. calling nvme_pci_complete_rq() inline with the cqe
> > b. racing with smp ipi or softirq
> >
> > If case (a), we will always see IDLE. If (b), we are racing and may see
> > either COMPLETED or IDLE, so we have to check that it's not either of
> > those. Since there's only one other state (STARTED) that was guaranteed
> > prior to entering the timeout handler, we can just make sure it's not
> > that one after the poll to know if abort escalation is needed.
>
> The point is still that "started" is the wrong check here and relies
> on an implementation detail. I think we're better off with an explicit
> IDLE check and a big fat comment.
So you want the check to look like this instead?
---
@@ -1362,7 +1362,8 @@ static enum blk_eh_timer_return nvme_timeout(struct request *req)
else
nvme_poll_irqdisable(nvmeq);
- if (blk_mq_request_completed(req)) {
+ if (blk_mq_request_completed(req) ||
+ blk_mq_rq_state(req) == MQ_RQ_IDLE) {
dev_warn(dev->ctrl.device,
"I/O %d QID %d timeout, completion polled\n",
req->tag, nvmeq->qid);
--
That's essentially a more complicated equivalent to what I have, but
fine with me if you think it's more clear.
Alternatively, I also considered moving the IDLE state setting to when
the request is actually freed, which might make more sense and works
without changing the nvme driver:
---
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -713,6 +713,7 @@ static void __blk_mq_free_request(struct request *rq)
struct blk_mq_hw_ctx *hctx = rq->mq_hctx;
const int sched_tag = rq->internal_tag;
+ WRITE_ONCE(rq->state, MQ_RQ_IDLE);
blk_crypto_free_request(rq);
blk_pm_mark_last_busy(rq);
rq->mq_hctx = NULL;
@@ -741,7 +742,6 @@ void blk_mq_free_request(struct request *rq)
rq_qos_done(q, rq);
- WRITE_ONCE(rq->state, MQ_RQ_IDLE);
if (req_ref_put_and_test(rq))
__blk_mq_free_request(rq);
}
--
More information about the Linux-nvme
mailing list