[PATCH v2 1/3] cacheinfo: Set cache 'id' based on DT data

Thu Jul 10 04:24:01 PDT 2025

Hi James and Jonathan,

On 7/10/25 12:15, James Morse wrote:
> Hi Ben, Jonathan,
> 
> On 07/07/2025 13:32, Jonathan Cameron wrote:
>> On Mon, 7 Jul 2025 11:27:06 +0100
>> Ben Horgan <ben.horgan at arm.com> wrote:
>>> On 7/4/25 18:38, James Morse wrote:
>>>> From: Rob Herring <robh at kernel.org>
>>>> Use the minimum CPU h/w id of the CPUs associated with the cache for the
>>>> cache 'id'. This will provide a stable id value for a given system. As
>>>> we need to check all possible CPUs, we can't use the shared_cpu_map
>>>> which is just online CPUs. As there's not a cache to CPUs mapping in DT,
>>>> we have to walk all CPU nodes and then walk cache levels.
>>>>
>>>> The cache_id exposed to user-space has historically been 32 bits, and
>>>> is too late to change. This value is parsed into a u32 by user-space
>>>> libraries such as libvirt:
>>>> https://github.com/libvirt/libvirt/blob/master/src/util/virresctrl.c#L1588
>>>>
>>>> Give up on assigning cache-id's if a CPU h/w id greater than 32 bits
>>>> is found.
> 
>>>> diff --git a/drivers/base/cacheinfo.c b/drivers/base/cacheinfo.c
>>>> index cf0d455209d7..df593da0d5f7 100644
>>>> --- a/drivers/base/cacheinfo.c
>>>> +++ b/drivers/base/cacheinfo.c
>>>> @@ -183,6 +184,42 @@ static bool cache_node_is_unified(struct cacheinfo *this_leaf,
>>>>    	return of_property_read_bool(np, "cache-unified");
>>>>    }
>>>>    
>>>> +static bool match_cache_node(struct device_node *cpu,
>>>> +			     const struct device_node *cache_node)
>>>> +{
>>>> +	for (struct device_node *cache __free(device_node) = of_find_next_cache_node(cpu);
>>> Looks like the creation of this helper function has upset the
>>> device_node reference counting. This first __free(device_node) will only
>>> cause of_node_put() to be called in the case of the early return from
>>> the loop. You've dropped the second __free(device_node) which accounts
>>> for 'cache' changing on each iteration.
> 
> Heh, I just took this hunk verbatim. Fixing this up with the __free() magic is tricky as
> the existing patterns all drop the reference to cpu, which we don't want to do here. I
> think at this point the __free() magic is just making this harder to understand. How about
> the old fashioned way:
> 
> | static bool match_cache_node(struct device_node *cpu,
> |                              const struct device_node *cache_node)
> | {
> |         struct device_node *prev, *cache = of_find_next_cache_node(cpu);
> |
> |         while (cache) {
> |                 if (cache == cache_node) {
> |                         of_node_put(cache);
> |                         return true;
> |                 }
> |
> |                 prev = cache;
> |                 cache = of_find_next_cache_node(cache);
> |                 of_node_put(prev);
> |         }
> |
> |         return false;
> | }
Ok with me.
> 
> 
>> Good catch - this behaves differently from many of the of_get_next* type
>> helpers in that it doesn't drop the reference to the previous iteration
>> within the call.
>>
>> Maybe it should?
>>
>> I checked a few of the call sites and some would be simplified if it did
>> others would need some more complex restructuring but might benefit as
>> well.
> 
> If it did, we'd end up dropping the reference to cpu on the way in, which
> of_get_next_cpu_node() in for_each_of_cpu_node() was expecting to do.

Yes, I think the blurring of the lines between a cpu node and cache node 
is at least partially to blame for the confusion here.
> 
> 
> Thanks,
> 
> James

Thanks,

Ben