[PATCH v2] arm64/mpam: Clean MBWU monitor overflow bit

Zeng Heng zengheng4 at huawei.com
Tue Nov 4 05:48:47 PST 2025


Hi Ben,

On 2025/11/4 18:24, Ben Horgan wrote:
> Hi Zeng,
> 
> On 11/3/25 03:47, Zeng Heng wrote:
>> Hi Ben,
>>
>> On 2025/10/30 17:52, Ben Horgan wrote:
>>> Hi Zeng,
>>>
>>> On 10/29/25 07:56, Zeng Heng wrote:
>>>> The MSMON_MBWU register accumulates counts monotonically and is not
>>>> automatically cleared to zero on overflow. The overflow portion is
>>>> exactly what mpam_msmon_overflow_val() computes, so there is no need to
>>>> additionally subtract mbwu_state->prev_val.
>>>>
>>>> Before invoking write_msmon_ctl_flt_vals(), the overflow bit of the
>>>> MSMON_MBWU register must first be read to prevent it from being
>>>> inadvertently cleared by the write operation.
>>>>
>>>> Finally, use the overflow bit instead of relying on counter wrap-around
>>>> to determine whether an overflow has occurred. This avoids the case where
>>>> a wrap-around that leaves now > prev_val is overlooked. With this change,
>>>> prev_val no longer has any use, so remove it.
>>>>
>>>> CC: Ben Horgan <ben.horgan at arm.com>
>>>> Signed-off-by: Zeng Heng <zengheng4 at huawei.com>
>>>> ---
>>>>    drivers/resctrl/mpam_devices.c  | 22 +++++++++++++++++-----
>>>>    drivers/resctrl/mpam_internal.h |  3 ---
>>>>    2 files changed, 17 insertions(+), 8 deletions(-)
>>>
>>> This all looks fine for overflow, but what we've been forgetting about
>>> is the power management. As James mentioned in his commit message, the
>>> "prev_val is after now" check is doing double duty. If an MSC is powered
>>> down and reset then we lose the count. Hence, to keep an accurate count,
>>> we should be considering this case too.
>>>
>>
>>
>> Regarding CPU power management and CPU online/offline scenarios, this
>> should be, and already is, handled by mpam_save_mbwu_state(), which:
>>
>> 1. Freezes the current MSMON_MBWU counter into mbwu_state->correction;
>> 2. Clears the MSMON_MBWU counter.
>>
>> After the CPU is powered back on, the total bandwidth traffic is
>> MSMON_MBWU (the `now` variable) + correction.
>>
>> So the above solution also covers CPU power-down scenarios, and no
>> additional code is needed to adapt to this case.
>>
>> If I've missed anything, please point it out. Thanks in advance.
>>
> 
> No, I don't think you missed anything. You just didn't mention in your commit message
> that this is also fixing the power management case.
> 
> I'm going to post the next version of this series for James as he is otherwise engaged.
> I've taken your patch and adapted it to fit in with the order of patches.
> Does this look ok to you? The support for the long counters will be added later.
> 

Yes, I have reviewed the patch, and the related adaptations look good to
me.

> @@ -1016,6 +1025,9 @@ static void __ris_msmon_read(void *arg)
>          if (config_mismatch) {
>                  write_msmon_ctl_flt_vals(m, ctl_val, flt_val);
>                  overflow = false;
> +       } else if (overflow) {
> +               mpam_write_monsel_reg(msc, CFG_MBWU_CTL,
> +                                     cur_ctl & ~MSMON_CFG_x_CTL_OFLOW_STATUS);
>          }

Yes, the clear register operation is added here.
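
For readers following the thread, the scheme discussed above can be
illustrated with a small user-space model. This is only a sketch and not
the driver code: the sim_* names, the 31-bit counter width, the bit
position used for the overflow status, and the assumption that one
overflow adds exactly one full counter period to the correction are all
made up for illustration. It shows the overflow status being read before
it is cleared, and the save step folding the count into a software
correction before the counter is reset.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model parameters, not taken from the MPAM driver. */
#define SIM_COUNTER_MASK     ((1ULL << 31) - 1)  /* assumed 31-bit counter */
#define SIM_CTL_OFLOW_STATUS (1U << 26)          /* assumed status bit */

struct sim_mon {
	uint64_t value;       /* simulated hardware counter, wraps at 2^31 */
	uint32_t ctl;         /* simulated control register */
	uint64_t correction;  /* software-held part of the total */
};

/* Simulated hardware: count traffic, latch the overflow bit on wrap. */
static void sim_mon_count(struct sim_mon *m, uint64_t bytes)
{
	uint64_t sum = m->value + bytes;

	if (sum > SIM_COUNTER_MASK)
		m->ctl |= SIM_CTL_OFLOW_STATUS;
	m->value = sum & SIM_COUNTER_MASK;
}

/*
 * Read the running total. The overflow bit is sampled first, then
 * cleared, and one full counter period is added to the correction.
 * No comparison against a previous value is needed.
 */
static uint64_t sim_mon_read(struct sim_mon *m)
{
	bool overflow = m->ctl & SIM_CTL_OFLOW_STATUS;
	uint64_t now = m->value;

	if (overflow) {
		m->ctl &= ~SIM_CTL_OFLOW_STATUS;
		m->correction += SIM_COUNTER_MASK + 1;
	}
	return now + m->correction;
}

/* Before power-down: freeze the total into correction, clear the counter. */
static void sim_mon_save(struct sim_mon *m)
{
	m->correction = sim_mon_read(m);
	m->value = 0;
}
```

Note that a single status bit can only record that at least one
wrap-around happened. If the counter wraps more than once between two
reads, this model (like any scheme based on one overflow bit) undercounts.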



Best Regards,
Zeng Heng



