[RFC PATCH] arch_topology: Pre-allocate cacheinfo from primary CPU

Wed Mar 29 23:57:24 PDT 2023

On 3/29/23 23:35, Radu Rendec wrote:
> On Wed, 2023-03-29 at 17:39 +0200, Pierre Gondois wrote:
>> On 3/29/23 17:03, Sudeep Holla wrote:
>>> On Wed, Mar 29, 2023 at 04:42:07PM +0200, Pierre Gondois wrote:
>>>>
>>>> This would mean that for all architectures, the cacheinfo would come from
>>>> ACPI/DT first.....
>>>
>>> x86 doesn't fall into the above category. So we need to ensure it continues
>>> to work with no errors.
>>
>> Ok, then maybe having a second arch specific function like
>> init_cache_level() would work.
>>
>> This function would be called in fetch_cache_info() after
>> init_of_cache_level()/acpi_get_cache_info() fail. It would fetch
>> cache info anywhere but in DT/ACPI.
>> Archs that don't want it would not implement it, and it would
>> allow the others to get the num_leaves/levels during early boot.
> 
> Hello Pierre,
> 
> If I understand correctly, in the case of arm64 this new function would
> use CLIDR_EL1 to detect the number of leaves/levels, right? But since
> init_cpu_topology() calls fetch_cache_info() for each CPU, doesn't this
> mean we would end up doing CLIDR_EL1 based detection for the secondary
> CPUs by running the (arch specific) detection code on the primary CPU?

Yes indeed, this would rely on the assumption made in the RFC that
the platform is symmetrical (i.e. all CPUs have the same number of
cache levels and leaves).
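
To make the levels/leaves counting concrete, here is a rough user-space
sketch of decoding a CLIDR_EL1-style value. It is an illustration only,
not the kernel code: struct cache_shape and clidr_to_shape() are
hypothetical names, and the value is passed in as a plain integer
instead of being read from the register. The Ctype<n> field layout
(3 bits per level, up to 7 levels) follows the Arm architecture.

```c
#include <stdint.h>

/* Ctype<n> encodings of CLIDR_EL1 (3 bits per cache level). */
enum ctype {
	CTYPE_NONE     = 0,	/* no cache at this level */
	CTYPE_INST     = 1,	/* instruction cache only */
	CTYPE_DATA     = 2,	/* data cache only */
	CTYPE_SEPARATE = 3,	/* separate instruction and data caches */
	CTYPE_UNIFIED  = 4,	/* unified cache */
};

#define MAX_CACHE_LEVEL	7

/* Hypothetical container for the two numbers needed for allocation. */
struct cache_shape {
	unsigned int levels;
	unsigned int leaves;
};

/*
 * Hypothetical helper: walk the Ctype fields from level 1 upwards and
 * stop at the first level without a cache. A level with separate
 * I/D caches contributes two leaves, any other populated level one.
 */
static struct cache_shape clidr_to_shape(uint64_t clidr)
{
	struct cache_shape shape = { 0, 0 };
	unsigned int level;

	for (level = 1; level <= MAX_CACHE_LEVEL; level++) {
		unsigned int ctype = (clidr >> (3 * (level - 1))) & 0x7;

		if (ctype == CTYPE_NONE)
			break;

		shape.levels = level;
		shape.leaves += (ctype == CTYPE_SEPARATE) ? 2 : 1;
	}

	return shape;
}
```

With the symmetry assumption, the shape decoded on the primary CPU would
simply be reused for every secondary CPU.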

> 
> My intimate knowledge of arm64 is very limited, but I *assumed* one of
> the reasons why detect_cache_attributes() (and init_cache_level()) run
> on the secondary CPU today is because not all CPUs are necessarily
> identical. Another possible reason I can think of is because maybe on
> some architectures auto-detection isn't possible altogether before the
> secondary CPU is brought up.

Yes, I think you are right.

> 
> In particular, for arm64 is it possible that CLIDR_EL1 may not look the
> same depending on the CPU that reads it? What about SoC's with
> asymmetrical CPU cores? (no concrete example here, just assuming this
> is a real/possible thing)

This would indeed be an issue if not all CPUs have the same number/levels
of caches. In case there is no DT/ACPI, it should be possible to:
- from the primary CPU, allocate the cacheinfo using CLIDR_EL1 (making the
   assumption that the platform is symmetrical)
- from the secondary CPUs, if we know a pre-allocation has been made,
   run init_cache_level() and check that the pre-allocation was correct.
   If not, re-allocate the cacheinfo (and trigger a warning).
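
The two steps above could be sketched roughly as follows. This is a
user-space illustration under stated assumptions, not the kernel
implementation: struct cpu_cacheinfo_sketch, preallocate_from_primary()
and verify_on_secondary() are hypothetical names, calloc() stands in for
the kernel allocator, and the detected levels/leaves are passed in
rather than coming from init_cache_level().

```c
#include <stdio.h>
#include <stdlib.h>

struct cache_shape {
	unsigned int levels;
	unsigned int leaves;
};

/* Hypothetical per-CPU state; leaf_slots stands in for the leaf array. */
struct cpu_cacheinfo_sketch {
	struct cache_shape shape;
	int *leaf_slots;
	int preallocated;
};

/* (Re-)allocate the leaf array for the given shape, freeing any old one. */
static int alloc_slots(struct cpu_cacheinfo_sketch *ci, struct cache_shape shape)
{
	int *slots = calloc(shape.leaves ? shape.leaves : 1, sizeof(*slots));

	if (!slots)
		return -1;

	free(ci->leaf_slots);
	ci->leaf_slots = slots;
	ci->shape = shape;
	return 0;
}

/* Step 1: on the primary CPU, allocate assuming a symmetrical platform. */
static int preallocate_from_primary(struct cpu_cacheinfo_sketch *ci,
				    struct cache_shape primary)
{
	int ret = alloc_slots(ci, primary);

	if (!ret)
		ci->preallocated = 1;
	return ret;
}

/*
 * Step 2: on the secondary CPU, compare what was detected locally with
 * the pre-allocation. Returns 0 if the pre-allocation was correct,
 * 1 if the cacheinfo had to be (re-)allocated, -1 on allocation failure.
 */
static int verify_on_secondary(struct cpu_cacheinfo_sketch *ci,
			       struct cache_shape detected)
{
	if (ci->preallocated &&
	    ci->shape.levels == detected.levels &&
	    ci->shape.leaves == detected.leaves)
		return 0;

	if (ci->preallocated)
		fprintf(stderr,
			"warning: pre-allocated %u/%u levels/leaves, detected %u/%u\n",
			ci->shape.levels, ci->shape.leaves,
			detected.levels, detected.leaves);

	return alloc_slots(ci, detected) ? -1 : 1;
}
```

On a symmetrical platform the second step never re-allocates, so the
extra check only costs something on asymmetrical systems.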

I think this is more or less what was done in the RFC, the only difference
being that there is no call from smp_prepare_cpus(). Or did I miss something?

Regards,
Pierre