[PATCH] nvme: introduce panic_on_double_cqe param

Guixin Liu kanie at linux.alibaba.com
Tue Oct 28 18:42:11 PDT 2025



On 2025/10/23 13:14, Chaitanya Kulkarni wrote:
> On 10/22/25 6:54 AM, Guixin Liu wrote:
>> Add a new debug switch to control whether to trigger a kernel crash
>> when duplicate CQEs are detected, in order to preserve the kernel
>> context, such as sq, cq, and so on, for subsequent debugging and
>> analysis.
>>
>> Signed-off-by: Guixin Liu <kanie at linux.alibaba.com>
>> ---
>>    drivers/nvme/host/core.c | 5 +++++
>>    drivers/nvme/host/nvme.h | 3 +++
>>    2 files changed, 8 insertions(+)
>>
>> diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
>> index fa4181d7de73..7a3f9129a39c 100644
>> --- a/drivers/nvme/host/core.c
>> +++ b/drivers/nvme/host/core.c
>> @@ -95,6 +95,11 @@ module_param(apst_secondary_latency_tol_us, ulong, 0644);
>>    MODULE_PARM_DESC(apst_secondary_latency_tol_us,
>>    	"secondary APST latency tolerance in us");
>>    
>> +bool panic_on_double_cqe;
>> +EXPORT_SYMBOL_GPL(panic_on_double_cqe);
>> +module_param(panic_on_double_cqe, bool, 0644);
>> +MODULE_PARM_DESC(panic_on_double_cqe, "crash the kernel to save the scene");
>> +
>>    /*
>>     * Older kernels didn't enable protection information if it was at an offset.
>>     * Newer kernels do, so it breaks reads on the upgrade if such formats were
>> diff --git a/drivers/nvme/host/nvme.h b/drivers/nvme/host/nvme.h
>> index 102fae6a231c..24010d5d15ce 100644
>> --- a/drivers/nvme/host/nvme.h
>> +++ b/drivers/nvme/host/nvme.h
>> @@ -595,6 +595,8 @@ static inline u16 nvme_cid(struct request *rq)
>>    	return nvme_cid_install_genctr(nvme_req(rq)->genctr) | rq->tag;
>>    }
>>    
>> +extern bool panic_on_double_cqe;
>> +
>>    static inline struct request *nvme_find_rq(struct blk_mq_tags *tags,
>>    		u16 command_id)
>>    {
>> @@ -612,6 +614,7 @@ static inline struct request *nvme_find_rq(struct blk_mq_tags *tags,
>>    		dev_err(nvme_req(rq)->ctrl->device,
>>    			"request %#x genctr mismatch (got %#x expected %#x)\n",
>>    			tag, genctr, nvme_genctr_mask(nvme_req(rq)->genctr));
>> +		BUG_ON(panic_on_double_cqe);
>>    		return NULL;
>>    	}
>>    	return rq;
>
> I'm really not sure this is a good idea; I'll leave it to others.
>
>
> -ck
Yeah, I think so too, and I'd also like to find a more elegant solution.
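
For anyone following along, the genctr check that the BUG_ON() sits behind
can be modelled in userspace roughly as below. This is only a sketch under
assumptions: the model_* names and struct model_req are hypothetical and are
not kernel symbols, and the 4-bit generation counter / 12-bit tag split is
my simplified reading of how the command ID is laid out. It shows why a
second CQE carrying an old command ID no longer matches once the tag has
been reused.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical userspace model of the command ID layout.
 * Assumption: upper 4 bits carry a generation counter, lower 12 bits
 * carry the block-layer tag. None of these names exist in the kernel.
 */
#define MODEL_TAG_BITS     12
#define MODEL_TAG_MASK     0x0fffu
#define MODEL_GENCTR_MASK  0x000fu

struct model_req {
	uint8_t  genctr; /* bumped every time the tag is handed out again */
	uint16_t tag;    /* slot index of the request */
};

/* Build the 16-bit command ID that would be placed in the submission entry. */
static inline uint16_t model_cid(const struct model_req *rq)
{
	return (uint16_t)(((rq->genctr & MODEL_GENCTR_MASK) << MODEL_TAG_BITS) |
			  (rq->tag & MODEL_TAG_MASK));
}

/*
 * Return true when the generation carried by a completion's command ID
 * matches the request's current generation. A false result corresponds to
 * the "genctr mismatch" branch, i.e. a stale or duplicate completion.
 */
static inline bool model_cqe_matches(const struct model_req *rq,
				     uint16_t command_id)
{
	uint8_t genctr = (command_id >> MODEL_TAG_BITS) & MODEL_GENCTR_MASK;

	return genctr == (rq->genctr & MODEL_GENCTR_MASK);
}
```

In this model, a request with tag 5 and genctr 1 gets command ID 0x1005 and
its first completion matches. Once the tag is reused and genctr becomes 2, a
second completion that still carries 0x1005 fails the comparison, which is
the point where the proposed parameter would crash the kernel.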

Best Regards,
Guixin Liu



