[RFC PATCH] arch_topology: Pre-allocate cacheinfo from primary CPU

Radu Rendec rrendec at redhat.com
Thu Mar 30 16:32:51 PDT 2023


On Thu, 2023-03-30 at 08:57 +0200, Pierre Gondois wrote:
> On 3/29/23 23:35, Radu Rendec wrote:
> > On Wed, 2023-03-29 at 17:39 +0200, Pierre Gondois wrote:
> > > On 3/29/23 17:03, Sudeep Holla wrote:
> > > > On Wed, Mar 29, 2023 at 04:42:07PM +0200, Pierre Gondois wrote:
> > > > > 
> > > > > This would mean that for all architectures, the cacheinfo would come from
> > > > > ACPI/DT first.....
> > > > 
> > > > x86 doesn't fall into the above category. So we need to ensure it continues
> > > > to work with no errors.
> > > 
> > > Ok, then maybe having a second arch specific function like
> > > init_cache_level() would work.
> > > 
> > > This function would be called in fetch_cache_info() after
> > > init_of_cache_level()/acpi_get_cache_info() fail. It would fetch
> > > cache info anywhere but in DT/ACPI.
> > > Archs that don't want it would not implement it, and it would
> > > allow the others to get the num_leaves/levels during early boot.
> > 
> > In particular, for arm64 is it possible that CLIDR_EL1 may not look the
> > same depending on the CPU that reads it? What about SoC's with
> > asymmetrical CPU cores? (no concrete example here, just assuming this
> > is a real/possible thing)
> 
> This would indeed be an issue if all the CPUs don't have the same number/level
> of caches. In case there is no DT/ACPI, it should be possible to:
> - from the primary CPU using CLIDR_EL1, allocate the cacheinfo (making the
>    assumption the platform is symmetrical)
> - from the secondary CPUs, if we know a pre-allocation has been made,
>    run init_cache_level() and check the pre-allocation was correct.
>    If not, re-allocate the cacheinfo (and trigger a warning).

That makes sense. The question here is how we know a pre-allocation
has been made. If we modified fetch_cache_info() the way you proposed,
then we would always pre-allocate memory, and it would be impossible
to tell from the pointer alone whether the pre-allocation is based on
a "guessed" size.

Also I'm not sure about the "trigger a warning" part. If it's an RT
kernel, it will try to reallocate and that will be a "bug" splat
anyway. If not, the reallocation will fix the issue automatically.
Maybe a pr_warn() would be enough, as opposed to a WARN_ON() (or
perhaps that's what you had in mind as well).

> I think this is more or less what was done in the RFC, the only difference
> being there is no call from smp_prepare_cpus(), or did I miss something ?

Yes, it's more or less the same thing. The main differences are:
 * There is no window when the information in struct cpu_cacheinfo (the
   number of cache levels and leaves) is populated but potentially
   inaccurate (e.g. in the case of asymmetric CPUs with no DT/ACPI).
 * The logic in detect_cache_attributes() remains unchanged because the
   allocated memory is "published" to ci_cpu_cacheinfo only if the size
   is known to be correct (i.e. it comes from DT/ACPI).

Based on what you proposed, I hacked away at the code and came up with
the patch below. It works in my particular test case because the CPUs
are symmetrical, but it doesn't solve the reallocation part.

I can't think of an elegant way to solve this. One thing that comes to
mind is to change the logic in detect_cache_attributes(), but I am
reluctant to do that because this code is architecture independent,
while the "guess first/fix later" strategy is implementation specific
(arm64 in our case). The main problem is that the call to
init_cache_level() is skipped altogether if there is a valid pointer
value in ci_cpu_cacheinfo. In other words, if the memory has already
been allocated, it is assumed to be correct. This is why I used a
separate per-cpu variable in my original RFC patch.

Would it be horrible to add a bool field to struct cpu_cacheinfo that
tells us if the levels/leaves are based on early detection? The field
would be set by fetch_cache_info() when it calls early_cache_level() as
fallback. Then detect_cache_attributes() could avoid skipping the call
to init_cache_level() when the flag is set. The nice thing about this
is that it doesn't affect other architectures - if they don't implement
early_cache_level(), the flag is never set and then the behavior of
detect_cache_attributes() is basically unchanged.

Best regards,
Radu

From a79542d43a0ece06e13bfd431abc98299a9747f2 Mon Sep 17 00:00:00 2001
From: Radu Rendec <rrendec at redhat.com>
Date: Thu, 30 Mar 2023 18:18:46 -0400
Subject: [PATCH] Allocate cacheinfo early - WIP #1

Signed-off-by: Radu Rendec <rrendec at redhat.com>
---
 arch/arm64/kernel/cacheinfo.c | 22 ++++++++++++++++++++++
 drivers/base/cacheinfo.c      | 12 ++++++++++--
 include/linux/cacheinfo.h     |  1 +
 3 files changed, 33 insertions(+), 2 deletions(-)

diff --git a/arch/arm64/kernel/cacheinfo.c b/arch/arm64/kernel/cacheinfo.c
index c307f69e9b55..6f5ac5c385f0 100644
--- a/arch/arm64/kernel/cacheinfo.c
+++ b/arch/arm64/kernel/cacheinfo.c
@@ -38,6 +38,27 @@ static void ci_leaf_init(struct cacheinfo *this_leaf,
 	this_leaf->type = type;
 }
 
+int early_cache_level(unsigned int cpu)
+{
+	unsigned int ctype, level, leaves;
+	struct cpu_cacheinfo *this_cpu_ci = get_cpu_cacheinfo(cpu);
+
+	for (level = 1, leaves = 0; level <= MAX_CACHE_LEVEL; level++) {
+		ctype = get_cache_type(level);
+		if (ctype == CACHE_TYPE_NOCACHE) {
+			level--;
+			break;
+		}
+		/* Separate instruction and data caches */
+		leaves += (ctype == CACHE_TYPE_SEPARATE) ? 2 : 1;
+	}
+
+	this_cpu_ci->num_levels = level;
+	this_cpu_ci->num_leaves = leaves;
+	return 0;
+}
+
+#if 0
 int init_cache_level(unsigned int cpu)
 {
 	unsigned int ctype, level, leaves;
@@ -76,6 +97,7 @@ int init_cache_level(unsigned int cpu)
 	this_cpu_ci->num_leaves = leaves;
 	return 0;
 }
+#endif
 
 int populate_cache_leaves(unsigned int cpu)
 {
diff --git a/drivers/base/cacheinfo.c b/drivers/base/cacheinfo.c
index f6573c335f4c..4640671b4547 100644
--- a/drivers/base/cacheinfo.c
+++ b/drivers/base/cacheinfo.c
@@ -398,6 +398,11 @@ static void free_cache_attributes(unsigned int cpu)
 	cache_shared_cpu_map_remove(cpu);
 }
 
+int __weak early_cache_level(unsigned int cpu)
+{
+	return -ENOENT;
+}
+
 int __weak init_cache_level(unsigned int cpu)
 {
 	return -ENOENT;
@@ -446,8 +451,11 @@ int fetch_cache_info(unsigned int cpu)
 		 */
 		this_cpu_ci->num_leaves = levels + split_levels;
 	}
-	if (!cache_leaves(cpu))
-		return -ENOENT;
+	if (!cache_leaves(cpu)) {
+		ret = early_cache_level(cpu);
+		if (ret < 0)
+			return ret;
+	}
 
 	return allocate_cache_info(cpu);
 }
diff --git a/include/linux/cacheinfo.h b/include/linux/cacheinfo.h
index 908e19d17f49..9788100a7d3c 100644
--- a/include/linux/cacheinfo.h
+++ b/include/linux/cacheinfo.h
@@ -79,6 +79,7 @@ struct cpu_cacheinfo {
 };
 
 struct cpu_cacheinfo *get_cpu_cacheinfo(unsigned int cpu);
+int early_cache_level(unsigned int cpu);
 int init_cache_level(unsigned int cpu);
 int init_of_cache_level(unsigned int cpu);
 int populate_cache_leaves(unsigned int cpu);
-- 
2.39.2
