[PATCH v2] bus: mhi: host: don't free bhie tables during suspend/hibernation

Manivannan Sadhasivam manivannan.sadhasivam at linaro.org
Fri Apr 25 07:47:14 PDT 2025


On Fri, Apr 25, 2025 at 04:41:43PM +0500, Muhammad Usama Anjum wrote:
> On 4/25/25 1:59 PM, Manivannan Sadhasivam wrote:
> > On Fri, Apr 25, 2025 at 12:42:38PM +0500, Muhammad Usama Anjum wrote:
> >> On 4/25/25 12:32 PM, Manivannan Sadhasivam wrote:
> >>> On Fri, Apr 25, 2025 at 12:14:39PM +0500, Muhammad Usama Anjum wrote:
> >>>> On 4/25/25 12:04 PM, Manivannan Sadhasivam wrote:
> >>>>> On Thu, Apr 10, 2025 at 07:56:54PM +0500, Muhammad Usama Anjum wrote:
> >>>>>> Fix dma_direct_alloc() failure at resume time during bhie_table
> >>>>>> allocation. There is a crash report where at resume time, the memory
> >>>>>> from the dma doesn't get allocated and MHI fails to re-initialize.
> >>>>>> There may be fragmentation of some kind which fails the allocation
> >>>>>> call.
> >>>>>>
> >>>>>
> >>>>> If dma_direct_alloc() fails, then it is a platform limitation/issue. We cannot
> >>>>> work around that in the device drivers. What is the guarantee that other drivers
> >>>>> will also continue to work? Will you go ahead and patch all of them which
> >>>>> release memory during suspend?
> >>>>>
> >>>>> Please investigate why the allocation fails. This is not even a device issue, so
> >>>>> we cannot add quirks :/
> >>>> This isn't a platform specific quirk. We are only hitting it because
> >>>> there is high memory pressure during suspend/resume. This dma allocation
> >>>> failure can happen with memory pressure on any device.
> >>>>
> >>>
> >>> Yes.
> >> Thanks for understanding.
> >>
> >>>
> >>>> The purpose of this patch is just to make the driver more robust to memory
> >>>> pressure during resume.
> >>>>
> >>>> I'm not sure about MHI. But other drivers already have such patches as
> >>>> dma_direct_alloc() is susceptible to failures when memory pressure is
> >>>> high. This patch was motivated by ath12k [1] and ath11k [2].
> >>>>
> >>>
> >>> Even if we patch the MHI driver, the issue is going to trip some other driver.
> >>> How does the DMA memory go low during resume? So some other driver is
> >>> consuming more than it did during probe()?
> >> Think of it like this. The first probe happens just after boot. Most of the
> >> RAM was empty. Then let's say the user launches applications which not only
> >> consume the entire RAM but also the swap. The DMA memory area is the first
> >> ~4GB on x86_64 (if I'm not mistaken). Now at resume time when we want to
> >> allocate memory from dma, it may not be available entirely or because of
> >> fragmentation we cannot allocate that much contiguous memory.
> >>
> > 
> > Looks like you have a workload that consumes the limited DMA coherent memory.
> > Most likely the GPU applications I think.
> > 
> >> In our testing and real world cases, right now only the wifi driver is
> >> misbehaving. Wifi is also very important. So we are hoping to make wifi
> >> driver robust.
> >>
> > 
> > Sounds fair. If you want to move forward, please modify the existing
> > mhi_power_down_keep_dev() to include this partial unprepare as well:
> > 
> > mhi_power_down_unprepare_keep_dev()
> > 
> > Since both APIs are anyway going to be used together, I don't see a need to
> > introduce yet another API.
> I've looked at usages of mhi_power_down_keep_dev(). It's being used by
> both ath12k and ath11k. We would have to look at ath12k as well before
> we can change mhi_power_down_keep_dev(). Unfortunately, I don't have
> a device using ath12k at hand.
> 

The ath12k conversion looks trivial, so please go ahead with this new API
conversion for that driver as well.

- Mani

-- 
மணிவண்ணன் சதாசிவம்
