[PATCH 2/2] nvme: avoid possible double completions for the same request
Ye Bin
yebin at huaweicloud.com
Mon Jun 8 04:34:25 PDT 2026
From: Ye Bin <yebin10 at huawei.com>
Multiple NULL pointer dereferences have occurred in production because
the same request was reported as done twice after a hardware failure.
The resulting oops is shown below:
Unable to handle kernel NULL pointer dereference at virtual address 0000000000000000
Mem abort info:
ESR = 0x96000004
EC = 0x25: DABT (current EL), IL = 32 bits
SET = 0, FnV = 0
EA = 0, S1PTW = 0
Data abort info:
ISV = 0, ISS = 0x00000004
CM = 0, WnR = 0
user pgtable: 4k pages, 48-bit VAs, pgdp=00000020ce32d000
[0000000000000000] pgd=0000000000000000, p4d=0000000000000000
Internal error: Oops: 0000000096000004 [#1] SMP
CPU: 18 PID: 577 Comm: kworker/18:1H Kdump: loaded Tainted: G O K 5.10.0-182.aarch64 #1
Workqueue: kblockd blk_mq_requeue_work
pstate: 40400009 (nZcv daif +PAN -UAO -TCO BTYPE=--)
pc : blk_mq_request_bypass_insert+0x60/0x1a0
lr : blk_mq_request_bypass_insert+0x40/0x1a0
sp : ffff8001046fbcf0
x29: ffff8001046fbcf0 x28: 0000000000000000
x27: ffff800103aa3c48 x26: ffff800101b4ac10
x25: ffff0027fff49005 x24: 0000000000000088
x23: 0000000000000000 x22: 0000000000000000
x21: ffff00208c86bc40 x20: ffff00208c86bc40
x19: 0000000000000000 x18: 0000000000000000
x17: 0000000000000000 x16: ffff800100db2c70
x15: 0000aaaaca58e4d0 x14: 0000000000000000
x13: 0000000000000000 x12: 0000000000000000
x11: 0000000000000000 x10: 0000000000000eb0
x9 : ffff800100640420 x8 : fefefefefefefeff
x7 : 0000000000000018 x6 : ffff00208a700d74
x5 : 00646b636f6c626b x4 : 0000000000000000
x3 : 0000000000000000 x2 : 0000000000000001
x1 : 0000000000000000 x0 : 0000000000000000
Call trace:
blk_mq_request_bypass_insert+0x60/0x1a0
blk_mq_requeue_work+0x154/0x2b0
process_one_work+0x1d8/0x4cc
worker_thread+0x158/0x410
kthread+0x108/0x134
ret_from_fork+0x10/0x18
To avoid this problem, add an NVME_REQ_COMPLETE flag, modelled on scsi
commit f1342709d18a ("scsi: Do not rely on blk-mq for double
completions"). The flag is set with test_and_set_bit() in the interrupt
completion path (nvme_try_complete_req()) and in nvme_cancel_request(),
so the same request cannot be completed twice. It is cleared in
nvme_setup_cmd() when a request is set up again.
Signed-off-by: Ye Bin <yebin10 at huawei.com>
---
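The guard described in the commit message can be illustrated outside the
kernel. The following is only a minimal userspace sketch, assuming C11
atomics as a stand-in for the kernel bit operations; struct fake_request,
REQ_COMPLETE_BIT, test_and_set_flag(), clear_flag() and complete_once() are
hypothetical names used for illustration and are not part of this patch or
of the nvme driver.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for the NVME_REQ_COMPLETE bit number. */
#define REQ_COMPLETE_BIT 4u

/* Hypothetical stand-in for struct nvme_request: only the flags are modelled. */
struct fake_request {
	atomic_uint flags;
	int completions;	/* how many times the completion body actually ran */
};

/* Userspace analogue of test_and_set_bit(): set the bit, return its old value. */
static bool test_and_set_flag(atomic_uint *flags, unsigned int bit)
{
	unsigned int mask = 1u << bit;

	return atomic_fetch_or(flags, mask) & mask;
}

/* Userspace analogue of clear_bit(), used when a request is set up again. */
static void clear_flag(atomic_uint *flags, unsigned int bit)
{
	atomic_fetch_and(flags, ~(1u << bit));
}

/* Returns true only for the first caller; later callers see the bit and back off. */
static bool complete_once(struct fake_request *rq)
{
	if (test_and_set_flag(&rq->flags, REQ_COMPLETE_BIT))
		return false;	/* already completed by another path */

	rq->completions++;
	return true;
}
```

In this sketch a second call to complete_once() on the same request is a
no-op until clear_flag() re-arms it, which mirrors how the completion and
cancel paths are meant to exclude each other.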
drivers/nvme/host/core.c | 6 ++++++
drivers/nvme/host/nvme.h | 3 +++
2 files changed, 9 insertions(+)
diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
index c06f18dc5a65..d162a713a00c 100644
--- a/drivers/nvme/host/core.c
+++ b/drivers/nvme/host/core.c
@@ -532,6 +532,10 @@ bool nvme_cancel_request(struct request *req, void *data)
 	if (blk_mq_rq_state(req) != MQ_RQ_IN_FLIGHT)
 		return true;
+	if (unlikely(test_and_set_bit(NVME_REQ_COMPLETE,
+				      &nvme_req(req)->flags)))
+		return true;
+
 	nvme_req(req)->status = NVME_SC_HOST_ABORTED_CMD;
 	set_bit(NVME_REQ_CANCELLED, &nvme_req(req)->flags);
 	blk_mq_complete_request(req);
@@ -1091,6 +1095,8 @@ blk_status_t nvme_setup_cmd(struct nvme_ns *ns, struct request *req)
 	if (!(req->rq_flags & RQF_DONTPREP))
 		nvme_clear_nvme_request(req);
+	else
+		clear_bit(NVME_REQ_COMPLETE, &nvme_req(req)->flags);
 	switch (req_op(req)) {
 	case REQ_OP_DRV_IN:
diff --git a/drivers/nvme/host/nvme.h b/drivers/nvme/host/nvme.h
index 1059976b413b..a60ffb6f463f 100644
--- a/drivers/nvme/host/nvme.h
+++ b/drivers/nvme/host/nvme.h
@@ -261,6 +261,7 @@ enum {
 	NVME_REQ_USERCMD = 1,
 	NVME_MPATH_IO_STATS = 2,
 	NVME_MPATH_CNT_ACTIVE = 3,
+	NVME_REQ_COMPLETE = 4,
 };
 static inline struct nvme_request *nvme_req(struct request *req)
@@ -807,6 +808,8 @@ static inline bool nvme_try_complete_req(struct request *req, __le16 status,
 	nvme_should_fail(req);
 	if (unlikely(blk_should_fake_timeout(req->q)))
 		return true;
+	if (unlikely(test_and_set_bit(NVME_REQ_COMPLETE, &rq->flags)))
+		return true;
 	return blk_mq_complete_request_remote(req);
 }
--
2.34.1