[RFC PATCH] arch_topology: Pre-allocate cacheinfo from primary CPU
Pierre Gondois
pierre.gondois at arm.com
Wed Mar 29 07:42:07 PDT 2023
Hello Radu,
On 3/27/23 16:23, Radu Rendec wrote:
> On Mon, 2023-03-27 at 14:02 +0200, Pierre Gondois wrote:
>> About populating the cache info from the CLIDR_EL1 register, it seems
>> the information might be incorrect for DT based system, and the mask
>> of L1 data/instruction caches could be advertised as private for
>> ACPI/DT based systems when no cache information is available.
>> There is a patch-set that should fix this at:
>> https://lore.kernel.org/all/20230327115953.788244-1-pierre.gondois@arm.com/
>
> Hello Pierre,
>
> Thanks for pointing out this issue and creating a patch-set to fix it.
> I will keep an eye on it. This is definitely important, since we are
> planing to rely on the CLIDR_EL1 based detection.
>
>> On 3/23/23 23:42, Radu Rendec wrote:
>>> This patch attempts to enable automatic detection for RT kernels when no
>>> DT/ACPI cache information is available, by pre-allocating cacheinfo
>>> memory on the primary CPU. The allocated memory size depends on the
>>> number of cache leaves, which at that point is unknown without the
>>> DT/ACPI information. What this patch does is guess the number of cache
>>> leaves and pre-allocate memory on the primary CPU, then go back and
>>> reallocate the memory if the guess turns out to be wrong when automatic
>>> detection eventually runs on the secondary CPU. In that case, it will
>>> basically revert to the original behavior and still trigger a splat on
>>> RT kernels. The assumption is that most systems have identical CPUs, so
>>> the number of cache leaves will be the same on the secondary CPUs as the
>>> primary CPU. The "guess" uses the number of leaves of the primary CPU.
>>
>> Would it work to do the pre-allocation in fetch_cache_info() instead of
>> returning '-ENOENT' ? This would allow to not add a new step in
>> smp_prepare_cpus() and let all the logic in cacheinfo.c
>
> That was my initial intention, and I agree it would be much cleaner.
> Unfortunately, the number of cache leaves of the primary CPU hasn't
> been detected/populated at the time when fetch_cache_info() is called.
> That means we don't know how much memory to pre-allocate.
>
> This is the call flow that I recorded (read from top to bottom):
> smp_prepare_cpus
> init_cpu_topology
> parse_acpi_topology
> parse_dt_topology
> fetch_cache_info(cpu)
> init_of_cache_level(cpu)
> acpi_get_cache_info(cpu)
> allocate_cache_info(cpu)
> store_cpu_topology(this_cpu)
> update_siblings_masks
> detect_cache_attributes
It seems that both fetch_cache_info() and init_cache_level()'s role are
to set num_levels/num_leaves variables:
- fetch_cache_info() tries to do it from DT/ACPI information.
- init_cache_level() allows arch specific implementations. In particular,
for arm64, the implementation relies on the content of clidr_el1 and then
tries to complete the cache info with ACPI/DT.
Maybe it would be worth:
- making fetch_cache_info() try to get the cacheinfo from DT/ACPI,
and it it fails fallback to init_cache_level().
- in arm64's init_cache_level() implementation, removing the code that
tries to get the information from ACPI/DT as fetch_cache_info() would
have already done it.
- In detect_cache_attributes(), replace init_cache_level() with
fetch_cache_info().
This would mean that for all architectures, the cacheinfo would come from
ACPI/DT first, and if it fails from init_cache_level(). It would be a big
change but maybe it would be clearer in the end.
What do you think ?
>
>
> The number of cache leaves of the primary CPU is populated inside that
> last call to detect_cache_attributes(). If you can think of any other
> way around it, I am definitely open to suggestions.
>
> On a slightly different note, I think most of this code/mechanism is
> shared with RISC-V, but smp_prepare_cpus() is ARM64 specific. Unless we
> can think of a way to avoid that extra call to pre_alloc_cache_info(),
> we may need one for RISC-V as well.
>
> My main intention was to get some feedback and see if you are willing
> to accept something along these lines upstream. I can improve it and
> send subsequent versions of the patch if needed. Thanks again for
> taking the time to review it.
Please bear in mind I don't have the final take and as what is suggested
would impact other architectures, it might be good to also have a conversation
on lkml,
Regards,
Pierre
>
> Best regards,
> Radu
>
More information about the linux-arm-kernel
mailing list