[PATCH v1] ufs: core: fix ufshcd_abort_all racing issue
Peter Wang (王信友)
peter.wang at mediatek.com
Mon Jun 24 02:13:42 PDT 2024
On Fri, 2024-06-21 at 10:15 -0700, Bart Van Assche wrote:
>
> External email : Please do not click links or open attachments until
> you have verified the sender or the content.
> On 6/21/24 4:30 AM, peter.wang at mediatek.com wrote:
> > [ ... ]
>
> Fixes: and Cc: stable tags are missing from this patch. Please add
> these.
>
Hi Bart,
Okay, will add next version. Thanks.
> > diff --git a/drivers/ufs/core/ufs-mcq.c b/drivers/ufs/core/ufs-
> mcq.c
> > index 8944548c30fa..3b2e5bcb08a7 100644
> > --- a/drivers/ufs/core/ufs-mcq.c
> > +++ b/drivers/ufs/core/ufs-mcq.c
> > @@ -512,8 +512,9 @@ int ufshcd_mcq_sq_cleanup(struct ufs_hba *hba,
> int task_tag)
> > return -ETIMEDOUT;
> >
> > if (task_tag != hba->nutrs - UFSHCD_NUM_RESERVED) {
> > -if (!cmd)
> > -return -EINVAL;
> > +/* Should return 0 if cmd is already complete by irq */
> > +if (!cmd || !ufshcd_cmd_inflight(cmd))
> > +return 0;
> > hwq = ufshcd_mcq_req_to_hwq(hba, scsi_cmd_to_rq(cmd));
> > } else {
> > hwq = hba->dev_cmd_queue;
>
> This code change reduces the race window but does not eliminate it
> since
> no locks are held around this code path.
>
> Additionally, are you sure that this change is necessary? I don't
> think
> that ufshcd_err_handler() calls ufshcd_mcq_sq_cleanup().
>
> Thanks,
>
> Bart.
Hi Bart,
Yes, this code reduces the race window only.
Beacuse if we want get mcq lock, we need know witch hwq[i] lock
can use.
But before we get which hwq[i] lock could use, we will get this null
pointer error.
This is quite the chicken or the eqq dilemma.
Could you have any suggestion?
Additionally, here is the backtrace of ufshcd_mcq_sq_cleanup.
ufshcd_try_to_abort_task: cmd pending in the device. tag = 6
Unable to handle kernel NULL pointer dereference at virtual address
0000000000000194
pc : [0xffffffd589679bf8] blk_mq_unique_tag+0x8/0x14
lr : [0xffffffd5862f95b4] ufshcd_mcq_sq_cleanup+0x6c/0x1cc
[ufs_mediatek_mod_ise]
Workqueue: ufs_eh_wq_0 ufshcd_err_handler [ufs_mediatek_mod_ise]
Call trace:
dump_backtrace+0xf8/0x148
show_stack+0x18/0x24
dump_stack_lvl+0x60/0x7c
dump_stack+0x18/0x3c
mrdump_common_die+0x24c/0x398 [mrdump]
ipanic_die+0x20/0x34 [mrdump]
notify_die+0x80/0xd8
die+0x94/0x2b8
__do_kernel_fault+0x264/0x298
do_page_fault+0xa4/0x4b8
do_translation_fault+0x38/0x54
do_mem_abort+0x58/0x118
el1_abort+0x3c/0x5c
el1h_64_sync_handler+0x54/0x90
el1h_64_sync+0x68/0x6c
blk_mq_unique_tag+0x8/0x14
ufshcd_clear_cmd+0x34/0x118 [ufs_mediatek_mod_ise]
ufshcd_try_to_abort_task+0x2c8/0x5b4 [ufs_mediatek_mod_ise]
ufshcd_err_handler+0xa7c/0xfa8 [ufs_mediatek_mod_ise]
process_one_work+0x208/0x4fc
worker_thread+0x228/0x438
kthread+0x104/0x1d4
ret_from_fork+0x10/0x20
Thanks.
Peter
More information about the Linux-mediatek
mailing list