ath11k resume fails due to kernel blocks probing MHI virtual devices

Manivannan Sadhasivam mani at kernel.org
Mon Jan 29 04:47:11 PST 2024


On Mon, Jan 29, 2024 at 01:37:41PM +0100, Rafael J. Wysocki wrote:
> On Mon, Jan 29, 2024 at 1:31 PM Manivannan Sadhasivam <mani at kernel.org> wrote:
> >
> > On Mon, Jan 29, 2024 at 01:22:27PM +0100, Rafael J. Wysocki wrote:
> > > On Mon, Jan 29, 2024 at 11:10 AM Baochen Qiang <quic_bqiang at quicinc.com> wrote:
> > > >
> > > > Hi Rafael and Pavel,
> > > >
> > > > Currently I am facing an ath11k (a kernel WLAN driver) resume issue
> > > > related to the kernel PM framework and the MHI module.
> > > >
> > > > Before introducing the issue details, I'd like to summarize how ath11k
> > > > interacts with MHI stack to download WLAN firmware to hardware target:
> > > > 1. When booting/restarting, ath11k powers on the MHI module and waits for
> > > > MHI channels to be ready.
> > > > 2. When powered on, the MHI stack creates some virtual MHI devices, which
> > > > represent MHI hardware channels, and adds them to the MHI bus. This
> > > > triggers the MHI client driver, named QRTR, to get matched and probe those
> > > > MHI devices. In probe, QRTR initializes the MHI channels and finally moves
> > > > them to the ready state.
> > > > 3. Once the MHI channels are ready, ath11k downloads the WLAN firmware to
> > > > the hardware target, and then WLAN is working.
> > > >
> > > > Such a flow works well in general, but introduces issues in the hibernation
> > > > cycle: when preparing for hibernation, ath11k powers down MHI, which
> > > > results in the MHI devices being destroyed, and thus QRTR resets the MHI
> > > > channels. When resuming from hibernation, ath11k powers on MHI and waits
> > > > for the MHI channels to be ready in its resume callback. As said above, MHI
> > > > creates and adds MHI devices to the MHI bus, but they can't be probed at
> > > > that time because device probing has been blocked by device_block_probing().
> > > > As a result, ath11k resume times out.
> > > >
> > > > Now there is a potential fix for this issue which would need changes in
> > > > the MHI stack, i.e., don't destroy MHI devices while hibernating.
> > >
> > > Exactly.
> > >
> >
> > During hibernation, the power to ath11k could be lost and in that case, there
> > will be no channels available from the device. So keeping the "struct dev" when
> > there is no real device attached to the system, goes against the driver model
> > IMO since we would be messing with the refcount.
> 
> But this is system hibernation or suspend and the reason for the power
> loss is quite different from device removal at run time.
> 
> The device is going to be back during resume (or at least it is not
> expected to go away in the meantime), so it is pointless to destroy
> its representation in memory.
> 
> > For instance in the case of USB, if the device gets unplugged, would it make
> > sense to keep the "struct dev" for the device in kernel in a hope that it would
> > come back again?
> 
> At run time - no, during system suspend - yes.
> 
> It is not even recommended to free IRQs during system suspend.
> 

Hmm, okay. Thanks for clearing it up.

> > The driver model as I understood is, once the actual physical device gets
> > removed, the refcount for "struct dev" should be decremented and it should be
> > destroyed.
> 
> Not really.
> 

Okay. My understanding seems to be wrong then. I will move forward with the
proposal to keep the devices.
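
For illustration only, the difference between the two approaches discussed
above can be modelled with a small userspace toy. This is a sketch under
stated assumptions, not MHI or ath11k code: every name in it (toy_mhi_dev,
toy_add_device(), toy_hibernate_cycle(), and so on) is hypothetical, the
probing_blocked flag merely stands in for the window in which
device_block_probing() is in effect, and the multiple phases of a real
hibernation cycle are collapsed into one.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only: none of these names are kernel or MHI APIs. */
struct toy_mhi_dev {
    bool registered; /* device object exists on the (toy) bus */
    bool bound;      /* a client driver (QRTR in this thread) has probed it */
};

/* Stand-in for the window in which device_block_probing() is in effect. */
static bool probing_blocked;

/* Add the device to the toy bus; the probe only happens if not blocked. */
static void toy_add_device(struct toy_mhi_dev *d)
{
    d->registered = true;
    d->bound = !probing_blocked;
}

/* Destroy the device; the client driver is unbound as a side effect. */
static void toy_remove_device(struct toy_mhi_dev *d)
{
    d->registered = false;
    d->bound = false;
}

/* A channel counts as ready only while a client driver is bound. */
static bool toy_channel_ready(const struct toy_mhi_dev *d)
{
    return d->registered && d->bound;
}

/*
 * Run one simplified hibernation cycle and report whether the channel is
 * ready at the point where the resume callback would start waiting for it.
 */
static bool toy_hibernate_cycle(struct toy_mhi_dev *d, bool keep_devices)
{
    bool ready;

    assert(toy_channel_ready(d)); /* precondition: normal boot completed */

    probing_blocked = true;       /* system transition begins */

    if (!keep_devices)
        toy_remove_device(d);     /* current behaviour: destroy on power down */

    if (!d->registered)
        toy_add_device(d);        /* resume: re-added, but probe cannot run */

    ready = toy_channel_ready(d);
    probing_blocked = false;      /* unblocked only after the resume callbacks */
    return ready;
}
```

With a device that was added while probing was allowed,
toy_hibernate_cycle(&dev, false) returns false (the resume callback would
wait until it times out), while toy_hibernate_cycle(&dev, true) returns true.
A real implementation would presumably still have to bring the channels back
to a usable state after power loss; the toy ignores that.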

- Mani

-- 
மணிவண்ணன் சதாசிவம்


