[PATCH v2 1/3] cacheinfo: Set cache 'id' based on DT data

James Morse james.morse at arm.com
Thu Jul 10 04:15:08 PDT 2025


Hi Ben, Jonathan,

On 07/07/2025 13:32, Jonathan Cameron wrote:
> On Mon, 7 Jul 2025 11:27:06 +0100
> Ben Horgan <ben.horgan at arm.com> wrote:
>> On 7/4/25 18:38, James Morse wrote:
>>> From: Rob Herring <robh at kernel.org>
>>> Use the minimum CPU h/w id of the CPUs associated with the cache for the
>>> cache 'id'. This will provide a stable id value for a given system. As
>>> we need to check all possible CPUs, we can't use the shared_cpu_map
>>> which is just online CPUs. As there's not a cache to CPUs mapping in DT,
>>> we have to walk all CPU nodes and then walk cache levels.
>>>
>>> The cache_id exposed to user-space has historically been 32 bits, and
>>> is too late to change. This value is parsed into a u32 by user-space
>>> libraries such as libvirt:
>>> https://github.com/libvirt/libvirt/blob/master/src/util/virresctrl.c#L1588
>>>
>>> Give up on assigning cache-id's if a CPU h/w id greater than 32 bits
>>> is found.

>>> diff --git a/drivers/base/cacheinfo.c b/drivers/base/cacheinfo.c
>>> index cf0d455209d7..df593da0d5f7 100644
>>> --- a/drivers/base/cacheinfo.c
>>> +++ b/drivers/base/cacheinfo.c
>>> @@ -183,6 +184,42 @@ static bool cache_node_is_unified(struct cacheinfo *this_leaf,
>>>   	return of_property_read_bool(np, "cache-unified");
>>>   }
>>>   
>>> +static bool match_cache_node(struct device_node *cpu,
>>> +			     const struct device_node *cache_node)
>>> +{
>>> +	for (struct device_node *cache __free(device_node) = of_find_next_cache_node(cpu);  
>> Looks like the creation of this helper function has upset the 
>> device_node reference counting. This first __free(device_node) will only 
>> cause of_node_put() to be called in the case of the early return from 
>> the loop. You've dropped the second __free(device_node) which accounts 
>> for 'cache' changing on each iteration.

Heh, I just took this hunk verbatim. Fixing this up with the __free() magic is tricky as
the existing patterns all drop the reference to cpu, which we don't want to do here. I
think at this point the __free() magic is just making this harder to understand. How about
the old fashioned way:

| static bool match_cache_node(struct device_node *cpu,
|                              const struct device_node *cache_node)
| {
|         struct device_node *prev, *cache = of_find_next_cache_node(cpu);
|
|         while (cache) {
|                 if (cache == cache_node) {
|                         of_node_put(cache);
|                         return true;
|                 }
|
|                 prev = cache;
|                 cache = of_find_next_cache_node(cache);
|                 of_node_put(prev);
|         }
|
|         return false;
| }


> Good catch - this behaves differently from many of the of_get_next* type
> helpers in that it doesn't drop the reference to the previous iteration
> within the call.
> 
> Maybe it should?
> 
> I checked a few of the call sites and some would be simplified if it did
> others would need some more complex restructuring but might benefit as
> well.

If it did, we'd end up dropping the reference to cpu on the way in, which
of_get_next_cpu_node() in for_each_of_cpu_node() was expecting to do.


Thanks,

James



More information about the linux-arm-kernel mailing list