[PATCH v2 27/29] arm_mpam: Add helper to reset saved mbwu state

Fri Oct 10 09:53:25 PDT 2025

Hi Shaopeng,

On 18/09/2025 03:35, Shaopeng Tan (Fujitsu) wrote:
>> resctrl expects to reset the bandwidth counters when the filesystem is
>> mounted.
>>
>> To allow this, add a helper that clears the saved mbwu state. Instead of cross
>> calling to each CPU that can access the component MSC to write to the counter,
>> set a flag that causes it to be zero'd on the next read. This is easily done by
>> forcing a configuration update.

>> diff --git a/drivers/resctrl/mpam_devices.c b/drivers/resctrl/mpam_devices.c
>> index 3080a81f0845..8254d6190ca2 100644
>> --- a/drivers/resctrl/mpam_devices.c
>> +++ b/drivers/resctrl/mpam_devices.c
>> @@ -1112,7 +1122,10 @@ static void __ris_msmon_read(void *arg)
>>  	read_msmon_ctl_flt_vals(m, &cur_ctl, &cur_flt);
>>  	clean_msmon_ctl_val(&cur_ctl);
>>  	gen_msmon_ctl_flt_vals(m, &ctl_val, &flt_val);
>> -	if (cur_flt != flt_val || cur_ctl != (ctl_val | MSMON_CFG_x_CTL_EN))
>> +	config_mismatch = cur_flt != flt_val ||
>> +			  cur_ctl != (ctl_val | MSMON_CFG_x_CTL_EN);
>> +
>> +	if (config_mismatch || reset_on_next_read)
>>  		write_msmon_ctl_flt_vals(m, ctl_val, flt_val);

I don't have a platform that implements any of the bandwidth counters, so I may need a
hand to debug this ...

> mbm_handle_overflow() calls __ris_msmon_read() every second. 
> If there are multiple monitor groups, the config_mismatch will "true" every second. 

It shouldn't be - I think you've forced it into a pathological case that the resctrl glue
code tries very hard to avoid.

The pattern of allocating a monitor, detecting a mismatch and reconfiguring it is needed
for CSU. That stuff is re-usable for MBWU, but you never want it to happen outside
control/monitor group creation because it means you're losing data.

For those reading along at home:
resctrl expects there to be as many hardware monitors as PARTID*PMG - because every
control and monitor group has 'mbm_total_bytes' or equivalent files. User-space can read
these at any time, and the deal is they start at 0 from boot, and reset when the control
or monitor group is created.

This means the MPAM driver needs to have enough, and it needs to pre-configure them on
startup.

The resctrl glue code calls this 'free running'. It means when you call
resctrl_arch_mon_ctx_alloc() for a bandwidth monitor - it doesn't allocate a context, but
returns a magic out of range value 'USE_RMID_IDX' so that subsequent calls use the
pre-allocated monitor.

If you don't have PARTID*PMG's worth of monitors - you can't have resctrl's
mbm_total_bytes interface. People regularly complain about this - but the alternative is
counters that randomly reset, meaning you could never trust the value.
I have no intention of supporting that mode (it's already available in /dev/urandom!).
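
To make the 'free running' scheme above concrete, here is a rough standalone sketch. It
is not the driver code: every sketch_* name, the SKETCH_* constants, the PARTID/PMG counts
and the partid * NUM_PMG + pmg indexing are assumptions made up for illustration. It only
shows the shape of the idea: the allocator hands back an out-of-range sentinel, and the
reader turns that sentinel into a fixed, pre-configured monitor per PARTID/PMG pair.

```c
#include <stdbool.h>
#include <stdint.h>

/* Illustrative sizes only; real values are discovered from hardware. */
#define SKETCH_NUM_PARTID	4
#define SKETCH_NUM_PMG		2

/* Hypothetical stand-in for the out-of-range 'USE_RMID_IDX' value. */
#define SKETCH_USE_RMID_IDX	(UINT16_MAX + 1)

/* Free-running monitors are only possible with one monitor per PARTID*PMG. */
static bool sketch_have_enough_monitors(int num_mon)
{
	return num_mon >= SKETCH_NUM_PARTID * SKETCH_NUM_PMG;
}

/* Bandwidth monitors are pre-allocated: nothing to allocate, return the sentinel. */
static int sketch_mon_ctx_alloc(void)
{
	return SKETCH_USE_RMID_IDX;
}

/*
 * Pick the hardware monitor for a read. With the sentinel, the monitor is
 * derived from the PARTID/PMG pair (assumed indexing), so each group always
 * maps to the same monitor. Any other value is used as the monitor index.
 */
static int sketch_pick_monitor(int mon_ctx, int partid, int pmg)
{
	if (mon_ctx == SKETCH_USE_RMID_IDX)
		return partid * SKETCH_NUM_PMG + pmg;

	return mon_ctx;
}
```

With these made-up sizes, 8 monitors are enough and 7 are not. In the second case the
only way to provide the files is to share monitors between groups, which is where the
reconfigure-and-lose-data behaviour comes from.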

If you're seeing this mismatch happen from the overflow thread - I think you've forced
the mbwu counters on when you don't have enough monitors.
Even if the resctrl overflow 'thread' used the same mon_ctx - USE_RMID_IDX means it will
access a different hardware monitor each time.
Another possibility is that clean_msmon_ctl_val() is missing a bit that is set by
hardware, causing the values to mismatch when they shouldn't.

Could you check mon_ctx is USE_RMID_IDX, and check which bits are mismatching?
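
For the 'which bits' part, something like this untested sketch may be enough as a debug
aid. It is standalone C rather than a patch against the driver: sketch_mismatch_bits()
and sketch_report_mismatch() are hypothetical names, and the values you feed them would
be whatever you capture for cur_ctl and (ctl_val | MSMON_CFG_x_CTL_EN), or for cur_flt
and flt_val.

```c
#include <stdint.h>
#include <stdio.h>

/* XOR leaves a 1 in every bit position where the two values disagree. */
static uint32_t sketch_mismatch_bits(uint32_t cur, uint32_t expected)
{
	return cur ^ expected;
}

/*
 * Print each differing bit with its value on both sides.
 * Returns the number of differing bits, 0 meaning the registers match.
 */
static int sketch_report_mismatch(const char *name, uint32_t cur, uint32_t expected)
{
	uint32_t diff = sketch_mismatch_bits(cur, expected);
	int count = 0;

	for (int bit = 0; bit < 32; bit++) {
		if (!(diff & (UINT32_C(1) << bit)))
			continue;

		printf("%s: bit %d differs (cur=%u expected=%u)\n", name, bit,
		       (unsigned int)((cur >> bit) & 1),
		       (unsigned int)((expected >> bit) & 1));
		count++;
	}

	return count;
}
```

A bit that is always set in the value read back but never in the generated one would
point at clean_msmon_ctl_val() rather than at the number of monitors.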

> Then "mbwu_state->prev_val = 0;" in function write_msmon_ctl_flt_vals() will be always run.
> This means that for multiple monitoring groups, the MemoryBandwidth monitoring value is cleared every second.

Yes - this should never happen because the overflow thread should never cause a mismatch,
and the monitors should only be reconfigured when control/monitor groups are allocated.
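
As a simplified model of the two cases, assuming nothing beyond what the hunk and your
description show: struct sketch_mon and sketch_msmon_read() are invented for this mail,
the saved state is reduced to a single prev_val, and clearing the flag after the rewrite
is my assumption about the intended behaviour. The real __ris_msmon_read() does rather
more than this.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of one hardware monitor plus its saved state. */
struct sketch_mon {
	uint32_t hw_ctl;		/* control value currently programmed */
	uint32_t hw_flt;		/* filter value currently programmed */
	uint64_t prev_val;		/* saved counter state */
	bool reset_on_next_read;	/* set by the reset helper */
};

/*
 * Model of one read. Returns true if the monitor had to be rewritten, which
 * also discards the saved state. A stable configuration only takes this path
 * when a reset was explicitly requested.
 */
static bool sketch_msmon_read(struct sketch_mon *m, uint32_t ctl, uint32_t flt,
			      uint64_t hw_count)
{
	bool config_mismatch = m->hw_flt != flt || m->hw_ctl != ctl;

	if (config_mismatch || m->reset_on_next_read) {
		m->hw_ctl = ctl;
		m->hw_flt = flt;
		m->prev_val = 0;		/* this is the data loss */
		m->reset_on_next_read = false;	/* assumed: one-shot flag */
		return true;
	}

	m->prev_val = hw_count;
	return false;
}
```

If one monitor is read with the same ctl/flt every time, only the first read and any
read after a requested reset rewrite it. If two groups with different filters take turns
on the same monitor, every read rewrites it, which matches the 'cleared every second'
symptom.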

Thanks,

James