[PATCH v1 01/10] ufs: host: mediatek: Fix runtime suspend error deadlock

Peter Wang (王信友) peter.wang at mediatek.com
Mon Sep 22 01:37:10 PDT 2025


On Fri, 2025-09-19 at 13:57 -0700, Bart Van Assche wrote:
> 
> If the suspend callback waits for error handling to finish and the
> error handler waits until resuming has finished, isn't this an issue
> that can occur for any UFS host controller and hence that should be
> fixed in the UFSHCI driver core rather than in one host driver only?
> 
> Why is the hba->pm_op_in_progress variable not sufficient to prevent
> this deadlock? Should this code perhaps be moved from
> ufshcd_eh_host_reset_handler() into ufshcd_err_handler()?
> 
>         /*
>          * If runtime PM sent SSU and got a timeout, scsi_error_handler is
>          * stuck in this function waiting for flush_work(&hba->eh_work). And
>          * ufshcd_err_handler(eh_work) is stuck waiting for runtime PM. Do
>          * ufshcd_link_recovery instead of eh_work to prevent deadlock.
>          */
>         if (hba->pm_op_in_progress) {
>                 if (ufshcd_link_recovery(hba))
>                         err = FAILED;
> 
>                 return err;
>         }
> 

Hi Bart,

Okay, so you would prefer to check pm_op_in_progress before getting
runtime PM, as in the patch below?
If so, I will drop this patch and add the check to the UFS core instead.

@@ -6625,6 +6625,11 @@ static void ufshcd_err_handler(struct work_struct *work)
        }
        spin_unlock_irqrestore(hba->host->host_lock, flags);

+       if (hba->pm_op_in_progress) {
+               ufshcd_link_recovery(hba);
+               return;
+       }
+
        ufshcd_err_handling_prepare(hba);
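
For illustration only, the control flow I have in mind can be modeled in
userspace C. This is a sketch, not kernel code: struct mock_hba and the
mock_* functions are hypothetical stand-ins for struct ufs_hba,
ufshcd_link_recovery() and ufshcd_err_handling_prepare(), and it assumes
that the prepare step is where the error handler would wait for runtime PM.

```c
#include <stdbool.h>

/* Hypothetical stand-in for struct ufs_hba; only what the sketch needs. */
struct mock_hba {
	bool pm_op_in_progress;	/* true while a suspend/resume callback runs */
	int link_recoveries;	/* number of direct link recoveries performed */
	int rpm_gets;		/* number of times runtime PM was requested */
};

/* Models ufshcd_link_recovery(): recovers without touching runtime PM. */
static int mock_link_recovery(struct mock_hba *hba)
{
	hba->link_recoveries++;
	return 0;
}

/*
 * Models ufshcd_err_handling_prepare(). Assumption: in the real driver
 * this is the step that would wait for runtime PM.
 */
static void mock_err_handling_prepare(struct mock_hba *hba)
{
	hba->rpm_gets++;
}

/*
 * Models the proposed check in ufshcd_err_handler().
 * Returns true if the full error handling path was taken,
 * false if the handler returned early after link recovery.
 */
static bool mock_err_handler(struct mock_hba *hba)
{
	if (hba->pm_op_in_progress) {
		/* A PM operation is running: never wait for runtime PM here. */
		mock_link_recovery(hba);
		return false;
	}

	mock_err_handling_prepare(hba);
	return true;
}
```

With pm_op_in_progress set, the mock handler only performs link recovery
and never requests runtime PM, which is the property that would avoid the
deadlock.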


> > > How can ufs_mtk_suspend() be called while the error handler is in
> > > progress? ufshcd_err_handler() disables RPM before it sets the
> > > UFSHCD_EH_IN_PROGRESS flag.
> > 
> > This error is triggered by ufs_mtk_auto_hibern8_disable,
> > As the comment description
> > /* May trigger EH work without exiting hibern8 error */
> > so it could happen during the suspend period.
> 
> That source code comment is confusing me, especially the "without
> exiting hibern8 error" part. Do you really want to say that the device
> is in a hibernation error state and remains in a hibernation error
> state?
> 

No, it just means that when exiting hibernate, the call
err = ufs_mtk_auto_hibern8_disable(hba);
can return with err equal to 0,
but a UIC error can still be triggered afterwards by an interrupt.
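
A small userspace sketch of that situation, under assumptions: struct
mock_host, mock_ah8_disable() and mock_uic_error_irq() are hypothetical
names, and the interrupt is modeled as a plain function call made after
the disable call has already returned.

```c
#include <stdbool.h>

/* Hypothetical host state; not the real struct ufs_hba. */
struct mock_host {
	bool auto_hibern8_enabled;
	bool uic_error_pending;	/* set from the (modeled) interrupt handler */
	bool eh_scheduled;	/* true once error handling work is queued */
};

/* Models the synchronous call: it succeeds and returns 0. */
static int mock_ah8_disable(struct mock_host *host)
{
	host->auto_hibern8_enabled = false;
	return 0;
}

/*
 * Models a UIC error interrupt arriving later. The caller of
 * mock_ah8_disable() has already seen err == 0 at this point.
 */
static void mock_uic_error_irq(struct mock_host *host)
{
	host->uic_error_pending = true;
	host->eh_scheduled = true;
}
```

The return value only covers the synchronous part, so err == 0 does not
rule out error handling being scheduled during the suspend period.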


> > > The UFSHCD_EH_IN_PROGRESS definition and also the
> > > ufshcd_set_eh_in_progress() and ufshcd_clear_eh_in_progress()
> > > definitions must remain in the UFS core private code. Please do not
> > > move these definitions into the include/ufs/ufshcd.h header file.
> > 
> > Do you think we should check ufshcd_eh_in_progress in
> > __ufshcd_wl_suspend? I'm not sure, because we don't see this
> > error on all UFS hosts — the vendor suspend operations
> > (ufshcd_vops_suspend) could be different.
> 
> Why is auto-hibernation disabled during suspend? As far as I know the
> UFSHCI standard allows to keep auto-hibernation enabled during
> suspend.
> 
> Thanks,
> 
> Bart.


This is a limitation of MediaTek's SoC.
If auto-hibernate is triggered concurrently with a manual
hibernate, it may cause errors. Therefore, we disable
auto-hibernate before issuing a manual hibernate command.
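
As a rough userspace model of that ordering (hypothetical names
throughout; the possible concurrency error is simplified to a
deterministic failure, which is an assumption made only so that the
sketch can be checked):

```c
#include <stdbool.h>

/* Hypothetical link state; not a real driver structure. */
struct mock_link {
	bool auto_hibern8_enabled;
	bool in_hibern8;
};

/* Models turning auto-hibernate off. */
static void mock_auto_hibern8_off(struct mock_link *link)
{
	link->auto_hibern8_enabled = false;
}

/*
 * Models a manual hibernate request. Returns 0 on success and -1 if
 * auto-hibernate is still enabled. On real hardware the failure is only
 * possible, not certain; the mock makes it deterministic.
 */
static int mock_manual_hibern8_enter(struct mock_link *link)
{
	if (link->auto_hibern8_enabled)
		return -1;

	link->in_hibern8 = true;
	return 0;
}

/* Models the suspend ordering: auto-hibernate off first, then manual. */
static int mock_suspend(struct mock_link *link)
{
	mock_auto_hibern8_off(link);
	return mock_manual_hibern8_enter(link);
}
```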

Thanks.
Peter


