[PATCH] arm64: mm: Pass physical address explicitly in map_range

Mon Mar 18 05:31:01 PDT 2024

On Mon, 18 Mar 2024 at 12:12, Pingfan Liu <piliu at redhat.com> wrote:
>
> On Fri, Mar 15, 2024 at 7:39 PM Ard Biesheuvel <ardb at kernel.org> wrote:
> >
> > On Thu, 14 Mar 2024 at 13:53, Pingfan Liu <piliu at redhat.com> wrote:
> > >
> > > This patch is not a fix, just a improvement in code reading.
> > >
> > > At the present, the deduction of a symbol's physical address hides in
> > > the compiler's trick. Introduce a function paddr(), which make the
> > > process explicitly at the sacrifice of one sub and one add instruction.
> > >
> > > Signed-off-by: Pingfan Liu <piliu at redhat.com>
> > > Cc: Catalin Marinas <catalin.marinas at arm.com>
> > > Cc: Will Deacon <will at kernel.org>
> > > Cc: Ard Biesheuvel <ardb at kernel.org>
> > > Cc: Kees Cook <keescook at chromium.org>
> > > To: linux-arm-kernel at lists.infradead.org
> > > ---
> > >  arch/arm64/kernel/pi/map_kernel.c | 30 +++++++++++++++---------------
> > >  arch/arm64/kernel/pi/map_range.c  | 21 +++++++++++++++++++--
> > >  arch/arm64/kernel/pi/pi.h         |  1 +
> > >  3 files changed, 35 insertions(+), 17 deletions(-)
> > >
...
> >
> > Hello Pingfan,
> >
> > I struggle to see the point of this patch. create_init_idmap() and
> > map_kernel() are only called from the early startup code, where the
> > code is mapped 1:1. Those routines are essentially position dependent,
> > just not in the conventional way.
> >
>
> From what angle, can it be considered as position dependent code?
> Could you enlighten me so I can have a refreshed understanding of this
> stuff.
>

There are basically two kinds of position independent code:
a) relocatable code, which contains statically initialized pointer
variables (which are absolute) but with metadata that allows a loader
to fix up those references to be correct for the actual placement of
the code;
b) true position independent code, which does not use any absolute
references at all, and can therefore execute at any offset

Toolchains always generate a), even with -fPIC/-fPIE etc, which mostly
control optimizations/codegen restrictions that allow shared libraries
and PIE executables to be placed at arbitrary offsets in memory
without the need for all code and r/o data sections to be updatable
(for relocation fixups).

In the kernel startup code, we use b), i.e., we generate code using
PC-relative symbol references, and explicitly forbid the use of
R_AARCH64_ABS64 relocations so that fixups are never needed. Since
toolchains generate a), we need to use a special tool (relacheck) for
this. The resulting code can execute anywhere in memory, and variable
reads and writes will work as expected, given that they are
PC-relative.

This code is in charge of creating the initial 1:1 mapping of memory,
and therefore, it needs the 1:1 mapped addresses of _text and _end,
etc. Given that we only permit PC-relative references in this code,
the only way it can produce the correct 1:1 address of those symbols
is if the code itself is mapped 1:1 too. In other words, calling the
same code again from the kernel's normal virtual mapping would produce
a bogus mapping. This does not matter, though: this code has a
specific purpose at early boot, and shouldn't be used afterwards.

> > You are adding a function that has the same properties - your paddr()
> > function is position dependent, and will only return the correct PA of
> > its argument if it is called from a 1:1 mapping of the code.
> >
>
> Is the 1:1 mapping the angle?
>

I don't understand this question.

> > Both existing routines call map_range(), which is position
> > independent, and can (and is) used from other contexts as well. All
> > code built under the pi/ subdirectory is built with instrumentation
> > disabled and without absolute symbol references, making it safe for
> > this particular routine to be called from elsewhere. This is also what
> > guarantees that the references to _text are using the correct
> > translation, as only PC-relative references are permitted.
> >
>
> Exactly, it ensures only PC-relative references.
>

Yes, but paddr() suggests that the result is the physical address. But
this is only the case if paddr() itself runs from a 1:1 mapping of
memory, otherwise it will be based on whatever other translation is
being used.

> > If there is confusion about this, feel free to propose improved
> > documentation. But these code changes only make things worse imo.
> >
>
> Does this look so straightforward for senior developers? Or I can send
> out some documents for it.
>

I cannot speak for other developers. I will note that this code is
highly bespoke for the early execution context where the startup code
lives. Good documentation is always better, but that doesn't mean this
code could or should be used more widely.