[PATCH] mmc: race condition between "sdcard hot plug out" and "system reboot"

Ulf Hansson ulf.hansson at linaro.org
Mon May 6 02:53:10 PDT 2024


On Mon, 6 May 2024 at 05:36, Joe.Zhou <Joe.Zhou at mediatek.com> wrote:
>
> From: Joe Zhou <Joe.Zhou at mediatek.com>
>
> > Thanks for your patch!
>
> > Doesn't commit 66c915d09b94 ("mmc: core: Disable card detect during
> > shutdown") take care of this problem?
>
> > Kind regards
> > Uffe
>
>
> Dear Ulf,
>      Thank you for your reply!
>
>      I think that commit 66c915d09b94 ("mmc: core: Disable card detect during shutdown") doesn't resolve this issue.
>      1. The issue may arise in the following sequence.
>      sdcard hot plug out:                                  SyS_reboot:
>      CPU0                                                  CPU1
>      _mmc_detect_change() {
>      ......
>      mmc_schedule_delayed_work(&host->detect, delay)
>      #Step1: call delay work &host->detect
>          mmc_rescan()
>          {
>           .......
>               #Step2: detect SD card removed
>               mmc_sd_detect() {                              ......
>                                                              _mmc_stop_host (.pre_shutdown)
>                                                             {
>               ......                                        #Step3: _mmc_stop_host() cancels the detect work synchronously
>                                                             cancel_delayed_work_sync(&host->detect)
>                                                             #Step4: wait delay work complete
>                                                             }
>                  if (err) {
>                  #Step5: host->card is NULL
>                  mmc_sd_remove(host);                        ......

Via mmc_sd_detect() we are also calling device_del(card) and
mmc_detach_bus(). In other words, when _mmc_stop_host() has
completed, the struct device corresponding to the card should have
been unregistered and host->bus_ops should be NULL.

In the later phase, mmc_bus_shutdown() seems to be called, which is
weird in the first place, as the struct device should have been removed
from the bus. Even if that does happen, host->bus_ops should be NULL,
so I would rather expect a NULL pointer dereference splat when
accessing the host->bus_ops->shutdown() callback.

What am I missing here?
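
To make the two orderings under discussion concrete, here is a minimal
userspace sketch in C. It is only a model: the struct and function names
(model_host, model_remove_card, model_detach_bus, model_shutdown) are
hypothetical stand-ins and not the kernel's code. It assumes that the
removal step clears host->card and the detach step clears host->bus_ops,
as described above. It does not explain how shutdown could run between
those two steps, which is the open question.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical userspace stand-ins; these are not the kernel's structs. */
struct model_card { int unused; };
struct model_bus_ops { int unused; };

struct model_host {
    const struct model_bus_ops *bus_ops; /* cleared by the detach step */
    struct model_card *card;             /* cleared by the remove step */
};

enum outcome {
    OUTCOME_OK,           /* both pointers are still valid */
    OUTCOME_NULL_BUS_OPS, /* shutdown would dereference bus_ops == NULL */
    OUTCOME_NULL_CARD     /* suspend would dereference card == NULL */
};

/* Models the effect attributed to mmc_sd_remove(): card pointer is gone. */
void model_remove_card(struct model_host *host)
{
    assert(host != NULL);
    host->card = NULL;
}

/* Models the effect attributed to mmc_detach_bus(): bus ops are gone. */
void model_detach_bus(struct model_host *host)
{
    assert(host != NULL);
    host->bus_ops = NULL;
}

/* Reports which pointer a shutdown call would trip over first. */
enum outcome model_shutdown(const struct model_host *host)
{
    if (host->bus_ops == NULL)
        return OUTCOME_NULL_BUS_OPS;
    if (host->card == NULL)
        return OUTCOME_NULL_CARD;
    return OUTCOME_OK;
}

/* Removal path finishes completely before shutdown runs. */
enum outcome run_expected_order(void)
{
    struct model_card card = { 0 };
    struct model_bus_ops ops = { 0 };
    struct model_host host = { &ops, &card };

    model_remove_card(&host);
    model_detach_bus(&host);
    return model_shutdown(&host);
}

/* Shutdown runs after the card is removed but before the bus is detached
 * (the interleaving claimed in Step5 to Step8 of the quoted diagram). */
enum outcome run_reported_order(void)
{
    struct model_card card = { 0 };
    struct model_bus_ops ops = { 0 };
    struct model_host host = { &ops, &card };

    model_remove_card(&host);
    return model_shutdown(&host);
}
```

In this model, run_expected_order() ends with bus_ops already NULL, while
run_reported_order() ends with a valid bus_ops and a NULL card. A log
splat would show which of the two pointers is actually being dereferenced.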

>                                                             #Step6: wait delay work complete
>                                                             mmc_sd_suspend (.shutdown)
>                                                             {
>                                                              ......
>
>                                                             #Step7:_mmc_sd_suspend claimed host
>                                                             mmc_claim_host(host);
>                                                             #Step8: use host->card (NULL pointer)
>                                                             if (mmc_card_suspended(host->card))
>                                                              ......
>                                                             }
>                  mmc_claim_host(host);
>                  mmc_detach_bus(host);
>                 }
>              }
>           }
>        ......
>       }
>
>      2. We have also reproduced the issue on a version that includes that patch.

Can you please send a detailed log-splat of what is happening?
Otherwise I may not be able to help.

>
> Best regards,
> Joe

Kind regards
Uffe


