[PATCH v1 2/2] ARM: dts: samsung: Add cache information to the Exynos542x SoC
Henrik Grimler
henrik at grimler.se
Tue Sep 9 13:22:18 PDT 2025
Hi Anand,
Thanks for working on this!
On Tue, Sep 09, 2025 at 07:29:31PM +0530, Anand Moon wrote:
[ ... ]
> > > >>>> On 30.07.2024 11:13, Anand Moon wrote:
> > > >>>>> As per the Exynos 5422 user manual add missing cache information to
> > > >>>>> the Exynos542x SoC.
> > > >>>>>
> > > >>>>> - Each Cortex-A7 core has 32 KB of instruction cache and
> > > >>>>> 32 KB of L1 data cache available.
> > > >>>>> - Each Cortex-A15 core has 32 KB of L1 instruction cache and
> > > >>>>> 32 KB of L1 data cache available.
> > > >>>>> - The little (A7) cluster has 512 KB of unified L2 cache available.
> > > >>>>> - The big (A15) cluster has 2 MB of unified L2 cache available.
> > > >>>>>
> > > >>>>> Features:
> > > >>>>> - Exynos 5422 support cache coherency interconnect (CCI) bus with
> > > >>>>> L2 cache snooping capability. This hardware automatic L2 cache
> > > >>>>> snooping removes the efforts of synchronizing the contents of the
> > > >>>>> two L2 caches in core switching event.
> > > >>>>>
> > > >>>>> Signed-off-by: Anand Moon <linux.amoon at gmail.com>
> > > >>>>
> > > >>>>
> > > >>>> The provided values are not correct. Please refer to commit 5f41f9198f29
> > > >>>> ("ARM: 8864/1: Add workaround for I-Cache line size mismatch between CPU
> > > >>>> cores"), which adds workaround for different l1 icache line size between
> > > >>>> big and little CPUs. This workaround gets enabled on all Exynos542x/5800
> > > >>>> boards.
> > > >>>>
> > > >>> Ok, I have just referred to the Exynos 5422 user manual for this patch.
> > > >>> This patch is just updating the cache size for CPU for big.LITTLE architecture.
I do not have access to the 5422 manual unfortunately, but if I add
some prints in the code from the commit Marek referenced:
```diff
--- a/arch/arm/mm/init.c
+++ b/arch/arm/mm/init.c
@@ -173,6 +173,7 @@ void check_cpu_icache_size(int cpuid)
 	asm("mrc p15, 0, %0, c0, c0, 1" : "=r" (ctr));
 	size = 1 << ((ctr & 0xf) + 2);
+	pr_warn("CPU%u: icache line size: %u, size %u\n", cpuid, icache_size, size);
 	if (cpuid != 0 && icache_size != size)
 		pr_info("CPU%u: detected I-Cache line size mismatch, workaround enabled\n",
 			cpuid);
```
Then we get in dmesg:
CPU0: icache line size: 64, size 32
CPU1: icache line size: 32, size 32
CPU2: icache line size: 32, size 32
CPU3: icache line size: 32, size 32
CPU4: icache line size: 32, size 64
CPU5: icache line size: 32, size 64
CPU6: icache line size: 32, size 64
CPU7: icache line size: 32, size 64
I read this as meaning that the i-cache-line-size property of CPU4, 5, 6
and 7 (i.e. cpu@0, cpu@1, cpu@2 and cpu@3) should be 64 instead of 32,
since the second value on each line is the line size computed from that
CPU's own CTR. I am not sure about the other properties.
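For reference, the `size` expression in the diff above decodes the
IminLine field (bits [3:0]) of the Cache Type Register, which holds log2
of the smallest I-cache line length in 32-bit words. A minimal standalone
sketch of the same decoding, where the helper name is made up for
illustration and is not a kernel function:

```c
#include <stdint.h>

/*
 * Decode the smallest I-cache line size, in bytes, from a CTR value.
 * CTR[3:0] (IminLine) is log2 of the line length in 32-bit words, so
 * bytes = 4 << IminLine = 1 << (IminLine + 2).
 * Hypothetical helper; it only mirrors the expression used in
 * check_cpu_icache_size().
 */
static inline uint32_t ctr_icache_line_bytes(uint32_t ctr)
{
	return 1u << ((ctr & 0xf) + 2);
}
```

An IminLine value of 3 decodes to 32 bytes and a value of 4 decodes to
64 bytes, which are the two `size` values seen in the dmesg output.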
> Here's an article that provides detailed insights into the cache feature.
> [0] http://jake.dothome.co.kr/cache4/
>
> The values associated with L1 and L2 caches indicate their respective sizes,
> as specified in the ARM Technical Reference Manual (TRM) below.
>
> Cortex-A15 L2 cache controller
> [0] https://developer.arm.com/documentation/ddi0503/i/programmers-model/programmable-peripherals-and-interfaces/cortex-a15-l2-cache-controller
>
> Cortex-A7 L2 cache controller
> [1] https://developer.arm.com/documentation/ddi0503/i/programmers-model/programmable-peripherals-and-interfaces/cortex-a7-l2-cache-controller
>
> These changes help define a fixed cache size, ensuring that active pages
> are mapped correctly within the expected cache boundaries.
>
> Here is the small test case using perf
> Before
>
> $ sudo perf stat -e L1-dcache-loads,L1-dcache-load-misses ./fact
>
> Simulated Cache Miss Time (avg): 4766632 ns
> Factorial(10) = 3628800
>
> Performance counter stats for './fact':
>
>             926328      armv7_cortex_a15/L1-dcache-loads/
>      <not counted>      armv7_cortex_a7/L1-dcache-loads/               (0.00%)
>              16510      armv7_cortex_a15/L1-dcache-load-misses/  #    1.78% of all L1-dcache accesses
>      <not counted>      armv7_cortex_a7/L1-dcache-load-misses/         (0.00%)
>
> 0.008970031 seconds time elapsed
>
> 0.000000000 seconds user
> 0.009673000 seconds sys
>
> After
> $ sudo perf stat -e L1-dcache-loads,L1-dcache-load-misses ./fact
> Simulated Cache Miss Time (avg): 4623272 ns
> Factorial(10) = 3628800
>
> Performance counter stats for './fact':
>
>             930570      armv7_cortex_a15/L1-dcache-loads/
>      <not counted>      armv7_cortex_a7/L1-dcache-loads/               (0.00%)
>               4755      armv7_cortex_a15/L1-dcache-load-misses/  #    0.51% of all L1-dcache accesses
>      <not counted>      armv7_cortex_a7/L1-dcache-load-misses/         (0.00%)
>
> 0.011068250 seconds time elapsed
>
> 0.000000000 seconds user
> 0.010793000 seconds sys
I tried the same test on my odroid-xu4, but was not able to reliably
reproduce the improvement. Cache misses varied between about 0.8 % and
about 2.8 %. This was with a desktop UI installed, though; I will try it
in a headless installation in the next few days, and perhaps on
exynos5800 as well.
It might also be worth testing on the little and big cores separately, like:
$ sudo taskset -c 0,1,2,3 perf stat -e L1-dcache-loads,L1-dcache-load-misses ./fact
$ sudo taskset -c 4,5,6,7 perf stat -e L1-dcache-loads,L1-dcache-load-misses ./fact
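If the test program should pin itself instead of relying on taskset,
something along these lines might work. This is only a sketch: the helper
names are made up, and the CPU numbering (0-3 little, 4-7 big) is an
assumption based on the dmesg output earlier in this mail.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <sched.h>

/* Fill 'set' with the CPUs first..last, inclusive (hypothetical helper). */
static void fill_cpu_range(cpu_set_t *set, int first, int last)
{
	assert(first >= 0 && first <= last && last < CPU_SETSIZE);

	CPU_ZERO(set);
	for (int cpu = first; cpu <= last; cpu++)
		CPU_SET(cpu, set);
}

/*
 * Pin the calling process to the CPUs first..last.
 * Returns 0 on success, or -1 with errno set by sched_setaffinity().
 */
static int pin_to_cpu_range(int first, int last)
{
	cpu_set_t set;

	fill_cpu_range(&set, first, last);
	return sched_setaffinity(0, sizeof(set), &set);
}
```

Calling pin_to_cpu_range(4, 7) at the start of the test program would
then correspond to the second taskset command above, and
pin_to_cpu_range(0, 3) to the first.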
Best regards,
Henrik Grimler