[RFC PATCH 27/36] arm_mpam: Allow configuration to be applied and restored during cpu online

Dave Martin Dave.Martin at arm.com
Mon Jul 28 08:34:27 PDT 2025


Hi,

On Mon, Jul 28, 2025 at 12:59:12PM +0100, Ben Horgan wrote:
> Hi James,
> 
> On 7/11/25 19:36, James Morse wrote:
> > When CPUs come online the original configuration should be restored.
> > Once the maximum partid is known, allocate a configuration array for
> > each component, and reprogram each RIS configuration from this.
> > 
> > The MPAM spec describes how multiple controls can interact. To prevent
> > this happening by accident, always reset controls that don't have a
> > valid configuration. This allows the same helper to be used for
> > configuration and reset.
> > 
> > CC: Dave Martin <Dave.Martin at arm.com>
> > Signed-off-by: James Morse <james.morse at arm.com>
> > ---
> >   drivers/platform/arm64/mpam/mpam_devices.c  | 236 ++++++++++++++++++--
> >   drivers/platform/arm64/mpam/mpam_internal.h |  26 ++-
> >   2 files changed, 234 insertions(+), 28 deletions(-)
> > 
> > diff --git a/drivers/platform/arm64/mpam/mpam_devices.c b/drivers/platform/arm64/mpam/mpam_devices.c
> > index bb3695eb84e9..f3ecfda265d2 100644
> > --- a/drivers/platform/arm64/mpam/mpam_devices.c
> > +++ b/drivers/platform/arm64/mpam/mpam_devices.c

[...]

> > @@ -1000,10 +1041,38 @@ static void mpam_reset_msc(struct mpam_msc *msc, bool online)

[...]

> > +static void mpam_reprogram_msc(struct mpam_msc *msc)
> > +{
> > +	int idx;
> > +	u16 partid;
> > +	bool reset;
> > +	struct mpam_config *cfg;
> > +	struct mpam_msc_ris *ris;
> > +
> > +	idx = srcu_read_lock(&mpam_srcu);
> > +	list_for_each_entry_rcu(ris, &msc->ris, msc_list) {
> > +		if (!mpam_is_enabled() && !ris->in_reset_state) {
> > +			mpam_touch_msc(msc, &mpam_reset_ris, ris);
> > +			ris->in_reset_state = true;
> > +			continue;
> > +		}
> > +
> > +		reset = true;
> > +		for (partid = 0; partid <= mpam_partid_max; partid++) {

> Do we need to consider 'partid_max_lock' here?

Just throwing in my 2¢, since I'd dug into this a bit previously:

Here, we are resetting an MSC or re-onlining a CPU.  Either way, I
think that this only happens after the initial probing phase is
complete.

mpam_enable_once() is ordered with respect to the task that did the
final unlock of partid_max_lock during probing, by means of the
schedule_work() call.  (See <linux/workqueue.h>.)

Taking the hotplug lock and installing mpam_cpu_online() for CPU
hotplug probably brings a sufficient guarantee also (though I've not
dug into it).

This function doesn't seem to be called during the probing phase (via
mpam_discovery_cpu_online()), so there shouldn't be any racing updates
to the global variables here.
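
As a rough userspace analogy of the ordering argument above (not the
driver code): the writer finishes updating a global under a lock and
only then starts the consumer, so the consumer can read the global
without taking the lock.  All demo_* names are made up for this
illustration, and pthread_create() stands in for schedule_work() as the
operation that orders the earlier writes before the consumer runs.

```c
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-ins for partid_max_lock and mpam_partid_max. */
static pthread_mutex_t demo_lock = PTHREAD_MUTEX_INITIALIZER;
static unsigned int demo_partid_max;	/* written only while "probing" */

/*
 * Stand-in for the work item: it is started only after probing has
 * finished, so it reads the global without taking demo_lock.
 */
static void *demo_enable_once(void *arg)
{
	unsigned int *seen = arg;

	*seen = demo_partid_max;
	return NULL;
}

/* Returns the value the worker observed, or 0 if it could not start. */
static unsigned int demo_probe_then_enable(unsigned int max)
{
	pthread_t worker;
	unsigned int seen = 0;

	/* "Probing": the last locked update of the global. */
	pthread_mutex_lock(&demo_lock);
	demo_partid_max = max;
	pthread_mutex_unlock(&demo_lock);

	/*
	 * POSIX lists pthread_create() among the functions that
	 * synchronise memory, so the store above is visible to the
	 * new thread.  This plays the role of schedule_work() here.
	 */
	if (pthread_create(&worker, NULL, demo_enable_once, &seen))
		return 0;

	pthread_join(worker, NULL);
	return seen;
}
```

Calling demo_probe_then_enable(31) hands back the value the worker saw.
The safety of the unlocked read comes entirely from "no writer runs
after the hand-off", which is the property being argued for in the
driver.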

> > +			cfg = &ris->vmsc->comp->cfg[partid];
> > +			if (cfg->features)
> > +				reset = false;
> > +
> > +			mpam_reprogram_ris_partid(ris, partid, cfg);
> > +		}
> > +		ris->in_reset_state = reset;
> > +	}
> > +	srcu_read_unlock(&mpam_srcu, idx);
> > +}

[...]

> > @@ -1806,6 +1875,43 @@ static void mpam_unregister_irqs(void)

[...]

> > +static int __allocate_component_cfg(struct mpam_component *comp)
> > +{
> > +	if (comp->cfg)
> > +		return 0;
> > +
> > +	comp->cfg = kcalloc(mpam_partid_max + 1, sizeof(*comp->cfg), GFP_KERNEL);

> And here?

Similarly, this runs only in the mpam_enable_once() call.

[...]

> > @@ -1861,6 +1976,8 @@ static void mpam_reset_component_locked(struct mpam_component *comp)
> >   	might_sleep();
> >   	lockdep_assert_cpus_held();
> > +	memset(comp->cfg, 0, (mpam_partid_max * sizeof(*comp->cfg)));

> And here?

Similarly to mpam_reset_msc(), I think this probably only runs from
mpam_enable_once() or mpam_cpu_online().

I think most or all of the existing reads of the affected globals from
within mpam_resctrl.c are also callbacks from resctrl_init(), which
again executes during mpam_enable_once() (though I won't promise I
haven't missed one or two).

Once resctrl has fired up, I believe that the MPAM driver basically
trusts the IDs coming in from resctrl, and doesn't need to range-check
them against the global parameters again.

[...]

> Thanks,
> 
> Ben

I consciously haven't done all the homework on this.

Although it may look like the globals are read all over the place after
probing, I think this actually only happens during resctrl initialisation
(which is basically single-threaded).

The only place where they are read after probing and without mediation
via resctrl is on the CPU hotplug path.

Adding locking would ensure that an unstable value is never read, but
this is not sufficient by itself to ensure that the _final_ value of a
variable is read (for some definition of "final").  And, if there is a
well-defined notion of final value and there is sufficient
synchronisation to ensure that this is the value read by a particular
read, then by construction an unstable value cannot be read.
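
To make the "stable but not final" distinction concrete, here is a small
sketch.  It assumes, purely for illustration, that probing narrows a
global maximum one MSC at a time; the sketch_* names are hypothetical
and this is not the driver's code.

```c
#include <pthread.h>

/* Hypothetical global that probing narrows, plus its lock. */
static pthread_mutex_t sketch_lock = PTHREAD_MUTEX_INITIALIZER;
static unsigned int sketch_partid_max = 0xffff;

/* One probing step: the global can only shrink. */
static void sketch_probe_one(unsigned int msc_max)
{
	pthread_mutex_lock(&sketch_lock);
	if (msc_max < sketch_partid_max)
		sketch_partid_max = msc_max;
	pthread_mutex_unlock(&sketch_lock);
}

/*
 * A locked read never sees a half-written value, but nothing here
 * says whether more probing steps are still to come.
 */
static unsigned int sketch_read_locked(void)
{
	unsigned int v;

	pthread_mutex_lock(&sketch_lock);
	v = sketch_partid_max;
	pthread_mutex_unlock(&sketch_lock);
	return v;
}
```

If sketch_probe_one(63) runs, a locked read returns 63; after a later
sketch_probe_one(15) it returns 15.  Both reads were properly locked,
yet only the second one saw the final value.  What makes a read correct
is being ordered after the last probing step, and once that ordering
exists the lock adds nothing for the reader.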


I think that this kind of pattern is not that uncommon in the kernel,
though it is a bit painful to reason about.

Cheers
---Dave


