[RFC PATCH 01/38] arm64: mpam: Context switch the MPAM registers

Ben Horgan ben.horgan at arm.com
Thu Dec 18 07:54:48 PST 2025



On 12/18/25 15:38, Jonathan Cameron wrote:
> On Thu, 18 Dec 2025 14:55:23 +0000
> Ben Horgan <ben.horgan at arm.com> wrote:
> 
>> On 12/18/25 14:52, Ben Horgan wrote:
>>> Hi Jonathan,
>>>
>>> On 12/18/25 10:35, Jonathan Cameron wrote:  
>>>> On Fri, 5 Dec 2025 21:58:24 +0000
>>>> James Morse <james.morse at arm.com> wrote:
>>>>  
>>>>> MPAM allows traffic in the SoC to be labeled by the OS, these labels
>>>>> are used to apply policy in caches and bandwidth regulators, and to
>>>>> monitor traffic in the SoC. The label is made up of a PARTID and PMG
>>>>> value. The x86 equivalent calls these CLOSID and RMID, but they don't
>>>>> map precisely.
>>>>>
>>>>> MPAM has two CPU system registers that is used to hold the PARTID and PMG
>>>>> values that traffic generated at each exception level will use. These can be
>>>>> set per-task by the resctrl file system. (resctrl is the defacto interface
>>>>> for controlling this stuff).
>>>>>
>>>>> Add a helper to switch this.
>>>>>
>>>>> struct task_struct's separate CLOSID and RMID fields are insufficient
>>>>> to implement resctrl using MPAM, as resctrl can change the PARTID (CLOSID)
>>>>> and PMG (sort of like the RMID) separately. On x86, the rmid is an
>>>>> independent number, so a race that writes a mismatched  closid and rmid
>>>>> into hardware is benign. On arm64, the pmg bits extend the partid.
>>>>> (i.e. partid-5 has a pmg-0 that is not the same as partid-6's pmg-0).
>>>>> In this case, mismatching the values will 'dirty' a pmg value that
>>>>> resctrl believes is clean, and is not tracking with its 'limbo' code.
>>>>>
>>>>> To avoid this, the partid and pmg are always read and written as a pair.
>>>>> Instead of making struct task_struct's closid and rmid fields an
>>>>> endian-unsafe union, add the value to struct thread_info and always use
>>>>> READ_ONCE()/WRITE_ONCE() when accessing this field.
>>>>>
>>>>> Resctrl allows a per-cpu 'default' value to be set, this overrides the
>>>>> values when scheduling a task in the default control-group, which has
>>>>> PARTID 0. The way 'code data prioritisation' gets emulated means the
>>>>> register value for the default group needs to be a variable.
>>>>>
>>>>> The current system register value is kept in a per-cpu variable to
>>>>> avoid writing to the system register if the value isn't going to change.
>>>>> Writes to this register may reset the hardware state for regulating
>>>>> bandwidth.
>>>>>
>>>>> Finally, there is no reason to context switch these registers unless
>>>>> there is a driver changing the values in struct task_struct. Hide
>>>>> the whole thing behind a static key. This also allows the driver to
>>>>> disable MPAM in response to errors reported by hardware. Move the
>>>>> existing static key to belong to the arch code, as in the future
>>>>> the MPAM driver may become a loadable module.
>>>>>
>>>>> All this should depend on whether there is an MPAM driver, hide
>>>>> it behind CONFIG_MPAM.
>>>>>
>>>>> CC: Amit Singh Tomar <amitsinght at marvell.com>
>>>>> Signed-off-by: James Morse <james.morse at arm.com>  
>>>>  
>>>>> diff --git a/arch/arm64/include/asm/mpam.h b/arch/arm64/include/asm/mpam.h
>>>>> new file mode 100644
>>>>> index 000000000000..86a55176f884
>>>>> --- /dev/null
>>>>> +++ b/arch/arm64/include/asm/mpam.h
>>>>> @@ -0,0 +1,74 @@  
>>>> ...
>>>>  
>>>>> +/*
>>>>> + * The resctrl filesystem writes to the partid/pmg values for threads and CPUs,
>>>>> + * which may race with reads in __mpam_sched_in(). Ensure only one of the old
>>>>> + * or new values are used. Particular care should be taken with the pmg field
>>>>> + * as __mpam_sched_in() may read a partid and pmg that don't match, causing
>>>>> + * this value to be stored with cache allocations, despite being considered
>>>>> + * 'free' by resctrl.
>>>>> + *
>>>>> + * A value in struct thread_info is used instead of struct task_struct as the
>>>>> + * cpu's u64 register format is used, but struct task_struct has two u32'.  
>>>>
>>>> This comment probably wants to provide a little more info if it is to be useful,
>>>>
>>>> Is it a reference to the closid and rmid fields under CONFIG_X86_CPU_RESCTRL?
>>>> I'm not immediately understanding why that matters given you could slap
>>>> a union on it without (I think) resulting in anything else moving.  
>>>
>>> Yes, the fields referred to are those closid and rmid. As James writes
>>> in the commit message a union is an alternative, but it would be endian
>>> unsafe. Unlikely to matter but lets not break things.  
>>
>> Meant to say... I'll add clarification in this vein to the comment.
> 
> Goes to show I didn't read the patch description.  Oops.
> 
> I'm probably just being slow today, but why would it be endian unsafe?
> I didn't think the alternative would be to assume the two uses of the storage
> were valid at the same time but rather just to reuse the space (which would
> have 64bit alignment anyway).  For that matter we could just put a u64 in
> under a separate ifdef CONFIG_...
> 
> Obviously if the code made use of the access to closid / rmid for arm64
> systems it would be a problem.

Yes, I think it would only be unsafe if closid / rmid were accessed, but
if they're not, just reusing the spot is cleaner. I assume James' point
is that as we can't use what we've already got it's ok just to do
something new.

> 
> Anyway just expanding on the comment a bit should do the job with no
> need for any other changes.
> 
> Thanks,
> 
> Jonathan
> 
> 
>>
>>>
>>> I'm replying for James as he is otherwise engaged. Thanks for the review
>>> of this series and all your review on the previous MPAM series.
>>>   
>>>>
>>>> Now having it in thread_info moves it into arch header territory so
>>>> might make sense for that reason.
>>>>  
>>>>> + */
>>>>> +static inline u64 mpam_get_regval(struct task_struct *tsk)
>>>>> +{
>>>>> +#ifdef CONFIG_ARM64_MPAM
>>>>> +	return READ_ONCE(task_thread_info(tsk)->mpam_partid_pmg);
>>>>> +#else
>>>>> +	return 0;
>>>>> +#endif
>>>>> +}  
>>>>  
>>>
>>> Thanks,
>>>
>>> Ben
>>>   
>>
> 

Thanks,

Ben




More information about the linux-arm-kernel mailing list