[RFC 00/12] ARM: MPAM: add support for priority partitioning control

Reinette Chatre reinette.chatre at intel.com
Thu Aug 17 12:11:09 PDT 2023


(+Tony)

Hi Amit,

On 8/15/2023 8:27 AM, Amit Singh Tomar wrote:
> Arm Memory System Resource Partitioning and Monitoring (MPAM) supports
> different controls that can be applied to different resources in the system.
> For instance, there is an optional priority partitioning control where a priority
> value generated at one MSC propagates over the interconnect to other MSCs
> (known as downstream priority), or can be applied within an MSC for its internal
> operations.
> 
> The Marvell implementation of Arm MPAM supports the priority partitioning control,
> which allows the LLC MSC to generate priority values that get propagated (along with
> read/write requests from upstream) to the DDR block. Within the DDR block the
> priority values are mapped to different traffic classes under the DDR QoS strategy.
> The link[1] gives some idea about the DDR QoS strategy and terms like LPR, VPR
> and HPR.
> 
> Setup priority partitioning control under Resource control
> ----------------------------------------------------------
> At present, resource control (resctrl) provides a basic interface to configure/set up
> CAT (Cache Allocation Technology) and MBA (Memory Bandwidth Allocation) capabilities.
> Arm MPAM uses it to support controls like cache portion partitioning (CPOR) and
> MPAM bandwidth partitioning.
> 
> As an example, the "schemata" file under a resource control group contains information
> about cache portion bitmaps and memory bandwidth allocation, and these are used to
> configure the cache portion partitioning (CPOR) and MPAM bandwidth partitioning controls.
> 
> MB:0=0100
> L3:0=ffff
> 
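
For context, a minimal sketch of how these values are typically written on an existing
resctrl system (the group name "example_group" below is just an illustration):

mount -t resctrl resctrl /sys/fs/resctrl
cd /sys/fs/resctrl
mkdir example_group
echo "L3:0=ffff" > example_group/schemata
echo "MB:0=100" > example_group/schemata
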
> But resctrl doesn't provide a way to set up other controls that Arm MPAM provides
> (for instance, the priority partitioning control mentioned above). To support this,
> James has suggested using the already existing schemata to remain compatible with
> portable software, and the main idea behind this RFC is to have some kind of
> discussion on how resctrl can be extended to support the priority partitioning control.
> 
> To support the priority partitioning control, the "schemata" file is updated to accommodate
> a priority field (when the priority partitioning capability is detected), separated from
> the CPBM by the "," delimiter.
> 
> L3:0=ffff,f where f indicates the maximum downstream priority value.
> 
> This dspri value gets programmed per partition and can be used to override the
> QoS value coming from upstream (the CPU).
> 
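
If I understand the proposal correctly, usage of the new field would look something like
the below (the group name and values are illustrative):

cd /sys/fs/resctrl
mkdir latency_sensitive
echo "L3:0=ffff,f" > latency_sensitive/schemata   ##### full CPBM, dspri 0xf
cat latency_sensitive/schemata
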
> The RFC patch set[2] is based on James Morse's MPAM snapshot[3] for 6.2, and the ACPI
> table is based on DEN0065A_MPAM_ACPI_2.0.
>

There are some aspects of this that I think we should be cautious about. First,
there will inevitably be more properties in the future that need to be associated with
a resource allocation, and these may indeed differ between architectures
and individual platforms. Second, user space needs a way to know which properties
are supported and what the valid parameters may be.

On a high level I thus understand the goal to be to add support for assigning a
property to a resource allocation, with "Priority partitioning control" being
the first such property.

To that end, I have a few questions:
* How can this interface be expanded to support more properties, with the
  expectation that a system/architecture may not support all resctrl-supported
  properties?
* Is it possible for the supported properties to vary between, for example, different
  MSCs in the system? From the resctrl side this may mean that there is a resource,
  for example "L3", with multiple instances (cache with id #0, cache with id #1,
  etc.), but the supported properties or valid property values vary between the
  instances.
* How can user space know that a system supports "Priority partitioning control"?
  User space needs to know when/if it can attempt to write a priority to the
  schemata.
* How can user space know what priority values are valid for a particular system?
  A comparison with how existing properties are exposed is included below.
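
The limits of the existing properties are exposed to user space via resctrl's "info"
directory, for example (the printed values are illustrative):

cat /sys/fs/resctrl/info/L3/cbm_mask        ##### full cache bit mask, e.g. ffff
cat /sys/fs/resctrl/info/MB/min_bandwidth   ##### minimum bandwidth percentage, e.g. 10

Something along these lines would be needed for user space to discover whether the
priority property is supported and what its valid values are.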

> Test set-up and results:
> ------------------------
> 
> The downstream priority value feeds into the DRAM controller, and one of the important
> things it does with this value is to service requests sooner (based on the
> traffic class), hence reducing latency without affecting performance.

Could you please elaborate here? I expected reduced latency to have a big impact
on performance.

> 
> Within the DDR QoS strategy the priority values map to traffic classes as follows:
> 
> 0-5   ----> low priority (LPR)
> 6-10  ----> medium priority (VPR)
> 11-15 ----> high priority (HPR)
> 
> The benchmark[4] used is multichase.
> 
> Two partitions, P1 and P2:
> 
> Partition P1:
> -------------
> Assigned core 0
> 100% BW assignment
> 
> Partition P2:
> -------------
> Assigned cores 1-79
> 100% BW assignment
> 
> Test Script:
> ------------
> cd /sys/fs/resctrl
> mkdir p1
> cd p1
> echo 1 > cpus
> echo L3:1=8000,5 > schemata   ##### DSPRI set as 5 (lpr)
> echo "MB:0=100" > schemata
> 
> cd /sys/fs/resctrl
> mkdir p2
> cd p2
> echo ffff,ffffffff,fffffffe > cpus
> echo L3:1=8000,0 > schemata
> echo "MB:0=100" > schemata
> 
> ### Loaded latency run: core 0 does a chaseload (pointer chase) run with low priority value 5, and cores 1-79 do a memory bandwidth run ###

Could you please elaborate what is meant with a "memory bandwidth run"?

> ./multiload -v -n 10 -t 80 -m 1G -c chaseload  
> 
> cd /sys/fs/resctrl/p1
> 
> echo L3:1=8000,a > schemata  ##### DSPRI set as 0xa (vpr)
> 
> ### Loaded latency run: core 0 does a chaseload (pointer chase) run with medium priority value a, and cores 1-79 do a memory bandwidth run ###
> ./multiload -v -n 10 -t 80 -m 1G -c chaseload
> 
> cd /sys/fs/resctrl/p1
> 
> echo L3:1=8000,f > schemata  ##### DSPRI set as 0xf (hpr)
> 
> ### Loaded latency run: core 0 does a chaseload (pointer chase) run with high priority value f, and cores 1-79 do a memory bandwidth run ###
> ./multiload -v -n 10 -t 80 -m 1G -c chaseload
> 
> Results[5]:
> 
> LPR average latency is 204.862 ns, VPR average latency is 161.018 ns, and HPR average latency is 134.210 ns.

Reinette


