ath11k resume fails due to kernel blocks probing MHI virtual devices

Manivannan Sadhasivam mani at kernel.org
Mon Jan 29 04:31:12 PST 2024


On Mon, Jan 29, 2024 at 01:22:27PM +0100, Rafael J. Wysocki wrote:
> On Mon, Jan 29, 2024 at 11:10 AM Baochen Qiang <quic_bqiang at quicinc.com> wrote:
> >
> > Hi Rafael and Pavel,
> >
> > Currently I am facing an ath11k (a kernel WLAN driver) resume issue
> > related to the kernel PM framework and the MHI module.
> >
> > Before introducing the issue details, I'd like to summarize how ath11k
> > interacts with the MHI stack to download WLAN firmware to the hardware
> > target:
> > 1. When booting/restarting, ath11k powers on the MHI module and waits for
> > the MHI channels to be ready.
> > 2. When powered on, the MHI stack creates some virtual MHI devices, which
> > represent MHI hardware channels, and adds them to the MHI bus. This
> > triggers the MHI client driver, named QRTR, to get matched and probe those
> > MHI devices. In probe, QRTR initializes the MHI channels and finally moves
> > them to the ready state.
> > 3. Once the MHI channels are ready, ath11k downloads the WLAN firmware to
> > the hardware target, and then WLAN is working.
> >
> > Such a flow works well in general, but introduces issues in the
> > hibernation cycle: when preparing for hibernation, ath11k powers down MHI,
> > which results in the MHI devices being destroyed, and thus QRTR resets the
> > MHI channels. When resuming from hibernation, ath11k powers on MHI and
> > waits for the MHI channels to be ready in its resume callback. As said
> > above, MHI creates and adds MHI devices to the MHI bus, but they can't be
> > probed at that time because device probing has been blocked by
> > device_block_probing(). This finally results in an ath11k resume timeout.
> >
> > Now there is a potential fix for this issue which would need changes in
> > the MHI stack, i.e., don't destroy MHI devices while hibernating.
> 
> Exactly.
> 

During hibernation, the power to the ath11k device could be lost, and in that
case there will be no channels available from the device. So keeping the
"struct device" around when there is no real device attached to the system goes
against the driver model IMO, since we would be messing with the refcount.

For instance, in the case of USB, if the device gets unplugged, would it make
sense to keep the "struct device" for it in the kernel in the hope that it
would come back again?

The driver model, as I understand it, is that once the actual physical device
gets removed, the refcount of "struct device" should be decremented and, once
it drops to zero, the structure should be destroyed.
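
To illustrate the lifetime rule I mean, here is a rough user-space sketch. It is
not kernel code: the model_* names are made up for this example and only mimic
the idea behind get_device()/put_device() and the release callback, under the
assumption that the creator holds one reference and each additional user takes
its own.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical user-space stand-in for a refcounted device object. */
struct model_device {
	const char *name;
	int refcount;                              /* models the embedded refcount */
	void (*release)(struct model_device *dev); /* called when refcount hits 0 */
};

/* Counts release() calls so that a caller can observe destruction. */
int released_count;

void model_release(struct model_device *dev)
{
	released_count++;
	free(dev);
}

/* Create a device holding one initial reference (owned by the creator). */
struct model_device *model_device_create(const char *name)
{
	struct model_device *dev = calloc(1, sizeof(*dev));

	if (!dev)
		return NULL;
	dev->name = name;
	dev->refcount = 1;
	dev->release = model_release;
	return dev;
}

/* Take an extra reference, e.g. on behalf of a bound client driver. */
struct model_device *model_get_device(struct model_device *dev)
{
	if (dev)
		dev->refcount++;
	return dev;
}

/*
 * Drop one reference. Returns true if this was the last reference and the
 * object has been released; the pointer must not be used after that.
 */
bool model_put_device(struct model_device *dev)
{
	if (!dev)
		return false;
	assert(dev->refcount > 0);
	if (--dev->refcount > 0)
		return false;
	dev->release(dev);
	return true;
}
```

In this model, removal of the physical device corresponds to the creator calling
model_put_device() on its initial reference. The object then goes away as soon
as the remaining users drop theirs, instead of being kept around for a device
that may never come back.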

- Mani

-- 
மணிவண்ணன் சதாசிவம்


