[PATCH v2] ufs: core: fix ufshcd_abort_all racing issue
Wenchao Hao
haowenchao22 at gmail.com
Thu Jun 27 00:59:00 PDT 2024
On 2024/6/26 11:56, Peter Wang (王信友) wrote:
> On Tue, 2024-06-25 at 09:42 -0700, Bart Van Assche wrote:
>>
>>
>> Please include a full root cause analysis when reposting fixes for
>> the
>> reported crashes. It is not clear to me how it is possible that an
>> invalid pointer is passed to blk_mq_unique_tag() (0x194). As I
>> mentioned
>> in my previous email, freeing a request does not modify the request
>> pointer and does not modify the SCSI command pointer either. As one
>> can
>> derive from the blk_mq_alloc_rqs() call stack, memory for struct
>> request
>> and struct scsi_cmnd is allocated at request queue allocation time
>> and
>> is not freed until the request queue is freed. Hence, for a given
>> tag,
>> neither the request pointer nor the SCSI command pointer changes as
>> long
>> as a request queue exists. Hence my request for an explanation how it
>> is
>> possible that an invalid pointer was passed to blk_mq_unique_tag().
>>
>> Thanks,
>>
>> Bart.
>>
>
> Hi Bart,
>
> Sorry I have not explain root-cause clearly.
> I will add more clear root-cause analyze next version.
>
> And it is not an invalid pointer is passed to blk_mq_unique_tag(),
> I means blk_mq_unique_tag function try access null pointer.
> It is differnt and cause misunderstanding.
>
> The null pinter blk_mq_unique_tag try access is:
> rq->mq_hctx(NULL)->queue_num.
>
Hi Peter,
What is queue_num's offset of blk_mq_hw_ctx in your machine?
gdb vmlinux
(gdb) print /x (int)&((struct blk_mq_hw_ctx *)0)->queue_num
$5 = 0x164
I read your descriptions and wondered a same race flow as you described
following. But I found the offset mismatch, if the racing flow is correct,
then the address accessed in blk_mq_unique_tag() should be 0x164, not 0x194.
Maybe the offset is different between our machine?
What's more, if the racing flow is correct, I did not get how your changes
can address this racing flow.
> The racing flow is:
>
> Thread A
> ufshcd_err_handler step 1
> ufshcd_cmd_inflight(true) step 3
> ufshcd_mcq_req_to_hwq
> blk_mq_unique_tag
> rq->mq_hctx->queue_num step 5
>
> Thread B
> ufs_mtk_mcq_intr(cq complete ISR) step 2
> scsi_done
> ...
> __blk_mq_free_request
> rq->mq_hctx = NULL; step 4
>
> Thanks.
> Peter
>
>
>
>
More information about the Linux-mediatek
mailing list