[PATCH v3 2/3] nvme: make keep-alive synchronous operation
Nilay Shroff
nilay at linux.ibm.com
Tue Oct 8 10:46:45 PDT 2024
On 10/8/24 12:34, Christoph Hellwig wrote:
> On Tue, Oct 08, 2024 at 11:51:51AM +0530, Nilay Shroff wrote:
>> This fix helps avoid race by implementing keep-alive as a synchronous
>> operation so that admin queue-usage ref counter is decremented only
>
> Please spell out q_usage_counter as requested in the first round.
>
Yes sure, I will do it in the next patch revision.
>> after keep-alive command finish execution and returns its status. This
>> would ensure that we don't inadvertently destroy the fabric admin queue
>> until we finish processing of nvme keep-alive request and its status and
>> hence it's safe to delete the queue.
>
> I still fail to see why this requires a synchronous operation vs just
> calling blk_mq_free_request and thus decrementing q_usage_counter
> after checking the controller state.
>
> Maybe I'm just dumb and missing the obvious even after the last
> explanation, but then the commit log needs to be improved to explain
> it.
>
OK, I will update the commit log in the next patch revision.
BTW, I just tried experimenting with your suggestion of "removing the
blk_mq_free_request call from nvme_keep_alive_finish function and returning
RQ_END_IO_FREE instead of RQ_END_IO_NONE" and I could still hit the same issue.
The issue here is that after nvme_keep_alive_finish returns to the block
layer, the nvme keep-alive thread is still running the queue dispatch operation
(and hence still accessing the queue resources), while the queue might already
have been destroyed on another cpu.
nvme_keep_alive_work()
  ->blk_execute_rq_no_wait()
    ->blk_mq_run_hw_queue()
      ->blk_mq_sched_dispatch_requests()
        ->__blk_mq_sched_dispatch_requests()
          ->blk_mq_dispatch_rq_list()
            ->nvme_loop_queue_rq()
              ->nvme_fail_nonready_command()
                ->nvme_complete_rq()
                  ->nvme_end_req()
                    ->blk_mq_end_request()
                      ->__blk_mq_end_request() -- with your suggestion, we now decrement admin->q_usage_counter here
                        ->nvme_keep_alive_finish()
When the above call stack unwinds back to __blk_mq_sched_dispatch_requests,
the admin queue might already have been destroyed on another cpu. However,
__blk_mq_sched_dispatch_requests can still access the admin queue resources
at that point, which causes the crash reported in the cover letter.
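
To make the window concrete, here is a toy userspace model of the ordering
(not kernel code; toy_queue, toy_try_destroy and toy_dispatch are made-up
names for illustration only). It assumes teardown can proceed as soon as the
usage count reaches zero, and it only compares dropping the reference from
the completion, i.e. inside the dispatch path, with dropping it after the
dispatch has returned.

```c
#include <stdbool.h>

/* Toy userspace model, NOT kernel code. All names here are hypothetical. */
struct toy_queue {
	int usage;      /* stands in for q_usage_counter */
	bool destroyed; /* set once teardown has run */
};

/* Teardown (the "other cpu") may proceed only when no references remain. */
static bool toy_try_destroy(struct toy_queue *q)
{
	if (q->usage > 0)
		return false;
	q->destroyed = true;
	return true;
}

/*
 * Dispatch one keep-alive request that fails immediately, so its
 * completion runs inside the dispatch path.
 * Returns true if the dispatcher touched the queue after teardown ran.
 */
static bool toy_dispatch(struct toy_queue *q, bool put_in_completion)
{
	bool use_after_destroy;

	q->usage++;             /* request allocated: take a reference */

	if (put_in_completion)
		q->usage--;     /* reference dropped from the completion */

	/* Teardown is attempted while the dispatch is still unwinding. */
	toy_try_destroy(q);

	/* The dispatcher still looks at queue resources on its way out. */
	use_after_destroy = q->destroyed;

	if (!put_in_completion)
		q->usage--;     /* synchronous: drop only after dispatch is done */

	return use_after_destroy;
}
```

With put_in_completion set, toy_dispatch reports that the queue was touched
after teardown; with it cleared, teardown has to wait until the dispatch has
returned.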
>
>> -static enum rq_end_io_ret nvme_keep_alive_end_io(struct request *rq,
>> - blk_status_t status)
>> +static void nvme_keep_alive_finish(struct request *rq,
>> + blk_status_t status,
>> + struct nvme_ctrl *ctrl)
>
> And as a nitpick, this should be:
>
> static void nvme_keep_alive_finish(struct request *rq, blk_status_t status,
> struct nvme_ctrl *ctrl)
>
>
Yes, I will do it in the next patch.
Thanks,
--Nilay