[PATCH 1/3] soc cache: L3 cache driver for HiSilicon SoC

Linus Walleij linusw at kernel.org
Fri Feb 6 00:05:27 PST 2026


On Thu, Feb 5, 2026 at 3:39 PM Arnd Bergmann <arnd at arndb.de> wrote:

> Another similar technology is memory that has already been locked
> by firmware (or hardware design, i.e. not a cache), and there are
> a few I remember:
>
> - drivers/misc/sram.c exports sram from physical addresses to
>   userspace. For a deeply embedded system with known amounts
>   of locked-down L3 cache, the firmware could just pre-lock
>   the cache and expose it to the kernel as an sram.

I think that's possible but not practical. To use the special on-chip
RAMs for speedy execution the affected code (hard kernel) has to
be compiled as a separate PIC file and put into that memory (IIRC).

I think I heard of a platform (OMAP?) that would lock down the L2
cache and use it as some kind of on-chip memory during
retention.

> - your own arch/arm/kernel/tcm.c, which does not currently have
>   any upstream users. I don't remember if it ever had.

Nope, but it is just a very fast SRAM. The only thing that is special
about TCM is that it can actually be moved around in physical
memory (!) by altering some special registers.

> - arch/arc/ had an elaborate downstream patch for wireless
>   network SoCs from (IIRC) Quantenna that would link mark
>   any performance-sensitive .text and .data parts of the network
>   stack to be in on-chip SRAM, but no user interface.

That's how people mostly use SRAM for executable code, I
think. Some use it for data too, especially things like linked
scatterlists for DMA.
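
As a rough illustration of that approach (a sketch only, not taken
from the Quantenna patch or any in-tree code): the hot paths get a
section attribute, and the platform linker script is assumed to map
those sections onto the on-chip SRAM. The section names, the macro
names and the rx_* functions below are all made up for the example,
and it assumes GCC or Clang on an ELF target.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical section names. A real platform would pick its own and
 * place them in on-chip SRAM from its linker script; without that
 * the code still builds and runs, it just lives in ordinary memory.
 */
#define __sram_text __attribute__((section(".sram.text"), noinline))
#define __sram_data __attribute__((section(".sram.data")))

/* Hot counter kept in the (hypothetical) SRAM data section. */
static __sram_data uint32_t rx_packets = 0;

/* Hot-path helper placed in the (hypothetical) SRAM text section. */
__sram_text uint32_t rx_checksum(const uint8_t *buf, size_t len)
{
	uint32_t sum = 0;

	for (size_t i = 0; i < len; i++)
		sum += buf[i];
	rx_packets++;
	return sum;
}

/* Ordinary code can still read the SRAM-resident counter. */
uint32_t rx_packet_count(void)
{
	return rx_packets;
}
```

The point is that the placement decision is made at build time, which
is exactly the "special compilation" step that the cache lockdown
approach avoids.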

The Huawei patch is different because it doesn't require any
special compilation. Instead, it identifies the performance-critical
code in userspace, then tells the kernel to lock down those
specific lines. It's a more practical and generic way to deal with
this problem, admittedly.

What this patch doesn't solve at all is the situation when the
*kernel* wants to lock down some cache lines for whatever valid
reason. A generic API would make this possible when we
identify some performance-critical code inside the kernel as
well.
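
To make the idea concrete, here is a minimal sketch of what such a
generic API could look like. Everything in it is hypothetical: none
of these names (cache_lockdown_ops, cache_lockdown_register(),
cache_lockdown_range(), cache_unlock_range()) exist in the kernel or
in the patch, and the dummy backend only stands in for a real cache
controller driver. It is written as plain C so it can be compiled
outside the kernel.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* HYPOTHETICAL: callbacks a cache controller driver would provide. */
struct cache_lockdown_ops {
	int (*lock)(uintptr_t start, size_t len);
	int (*unlock)(uintptr_t start, size_t len);
};

/* The single registered backend, NULL until a driver registers. */
static const struct cache_lockdown_ops *lockdown_ops;

/* HYPOTHETICAL: called by the cache controller driver at probe time. */
int cache_lockdown_register(const struct cache_lockdown_ops *ops)
{
	if (!ops || !ops->lock || !ops->unlock)
		return -EINVAL;
	if (lockdown_ops)
		return -EBUSY;
	lockdown_ops = ops;
	return 0;
}

/* HYPOTHETICAL: callable from any kernel code or a userspace ABI. */
int cache_lockdown_range(uintptr_t start, size_t len)
{
	if (!len)
		return -EINVAL;
	if (!lockdown_ops)
		return -ENODEV;
	return lockdown_ops->lock(start, len);
}

int cache_unlock_range(uintptr_t start, size_t len)
{
	if (!len)
		return -EINVAL;
	if (!lockdown_ops)
		return -ENODEV;
	return lockdown_ops->unlock(start, len);
}

/* Dummy backend: only counts bytes, stands in for real hardware. */
static size_t dummy_locked_bytes;

static int dummy_lock(uintptr_t start, size_t len)
{
	(void)start;
	dummy_locked_bytes += len;
	return 0;
}

static int dummy_unlock(uintptr_t start, size_t len)
{
	(void)start;
	if (len > dummy_locked_bytes)
		return -EINVAL;
	dummy_locked_bytes -= len;
	return 0;
}

static const struct cache_lockdown_ops dummy_ops = {
	.lock = dummy_lock,
	.unlock = dummy_unlock,
};
```

With something along these lines, the userspace interface from the
patch and any in-kernel user would both go through the same
lock/unlock entry points, and the SoC driver would only supply the
callbacks.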

Yours,
Linus Walleij


