[PATCH v4] bus: mhi: host: don't free bhie tables during suspend/hibernation

Wed May 14 00:17:41 PDT 2025

On 5/13/25 8:16 PM, Jeff Hugo wrote:
> On 5/13/2025 12:44 AM, Muhammad Usama Anjum wrote:
>> On 5/12/25 11:46 PM, Jeff Hugo wrote:
>>> On 5/6/2025 8:49 AM, Muhammad Usama Anjum wrote:
>>>> Fix dma_direct_alloc() failure at resume time during bhie_table
>>>> allocation because of memory pressure. There is a report where at
>>>> resume time, the memory from the dma doesn't get allocated and MHI
>>>> fails to re-initialize.
>>>>
>>>> To fix it, don't free the memory at power down during suspend /
>>>> hibernation. Instead, use the same allocated memory again after every
>>>> resume / hibernation. This patch has been tested with resume and
>>>> hibernation both.
>>>>
>>>> The rddm is of constant size for a given hardware. While the fbc_image
>>>> size depends on the firmware. If the firmware changes, we'll free and
>>>> allocate new memory for it.
>>>
>>> Why is it valid to load new firmware as a result of suspend?  I don't
>>> users would expect that.
>> I'm not sure its valid or not. Like other users, I also don't expect
>> that firmware would get changed. It doesn't seem to be tested and hence
>> supported case.
>>
>> But other drivers have code which have implementation like this. I'd
>> mentioned previously that this patch was motivated from the ath12k [1]
>> and ath11k [2] patches. They don't free the memory and reuse the same
>> memory if new size is same.
> 
> It feels like this justification needs to be detailed in the commit
> text. I suspect at some point we'll have another MHI device where the FW
> will need to be cached.
Okay. I'll add this information to the commit message. Currently I've
not seen firmware caching being used other than graphics driver.

> 
>>>> diff --git a/drivers/bus/mhi/host/boot.c b/drivers/bus/mhi/host/boot.c
>>>> index efa3b6dddf4d2..bc8459798bbee 100644
>>>> --- a/drivers/bus/mhi/host/boot.c
>>>> +++ b/drivers/bus/mhi/host/boot.c
>>>> @@ -584,10 +584,17 @@ void mhi_fw_load_handler(struct mhi_controller
>>>> *mhi_cntrl)
>>>>         * device transitioning into MHI READY state
>>>>         */
>>>>        if (fw_load_type == MHI_FW_LOAD_FBC) {
>>>
>>> Why is this FBC specific?
>> It seems we allocate fbc_image only when firmware load type is
>> FW_LOAD_FBC. I'm just optimizing the buffer allocation here.
> 
> We alloc bhie tables in non FBC usecases. Is this somehow an FBC
> specific issue? Perhaps you could clarify the limits of this solution in
> the commit text?
Okay. I'll add information that we are optimizing the bhie allocations.
It has nothing to do with firmware type. I've found only 2 bhie
allocations; fbc_image and rddm_image. So we are optimizing those.

> 
>>
>>>
>>>> -        ret = mhi_alloc_bhie_table(mhi_cntrl, &mhi_cntrl->fbc_image,
>>>> fw_sz);
>>>> -        if (ret) {
>>>> -            release_firmware(firmware);
>>>> -            goto error_fw_load;
>>>> +        if (mhi_cntrl->fbc_image && fw_sz != mhi_cntrl->prev_fw_sz) {
>>>> +            mhi_free_bhie_table(mhi_cntrl, mhi_cntrl->fbc_image);
>>>> +            mhi_cntrl->fbc_image = NULL;
>>>> +        }
>>>> +        if (!mhi_cntrl->fbc_image) {
>>>> +            ret = mhi_alloc_bhie_table(mhi_cntrl, &mhi_cntrl-
>>>>> fbc_image, fw_sz);
>>>> +            if (ret) {
>>>> +                release_firmware(firmware);
>>>> +                goto error_fw_load;
>>>> +            }
>>>> +            mhi_cntrl->prev_fw_sz = fw_sz;
>>>>            }
>>>>              /* Load the firmware into BHIE vec table */
>>>> diff --git a/drivers/bus/mhi/host/pm.c b/drivers/bus/mhi/host/pm.c
>>>> index e6c3ff62bab1d..107d71b4cc51a 100644
>>>> --- a/drivers/bus/mhi/host/pm.c
>>>> +++ b/drivers/bus/mhi/host/pm.c
>>>> @@ -1259,10 +1259,19 @@ void mhi_power_down(struct mhi_controller
>>>> *mhi_cntrl, bool graceful)
>>>>    }
>>>>    EXPORT_SYMBOL_GPL(mhi_power_down);
>>>>    +static void __mhi_power_down_unprepare_keep_dev(struct
>>>> mhi_controller *mhi_cntrl)
>>>> +{
>>>> +    mhi_cntrl->bhi = NULL;
>>>> +    mhi_cntrl->bhie = NULL;
>>>
>>> Why?
>> This function is shorter version of mhi_unprepare_after_power_down(). As
>> we need different code path in case of suspend/hibernation case, I was
>> adding a new API which Mani asked me remove and consolidate into
>> mhi_power_down_keep_dev() instead. So this static function has been
>> added. [3]
> 
> I don't understand the need to zero these out.  Also, if you are copying
> part of the functionality of mhi_unprepare_after_power_down(), shouldn't
> that functionality be moved into your new API to eliminate duplication?
This how the cleanup works mhi_unprepare_after_power_down(). Yeah, it
makes sense to use this function in mhi_unprepare_after_power_down().

Sending next version soon.
> 


-- 
Regards,
Usama