[RFC PATCH] Documentation/arm64: describe the kernel's expectations of 'memory'

Jonathan Cameron Jonathan.Cameron at Huawei.com
Mon May 17 05:17:25 PDT 2021


On Mon, 17 May 2021 13:55:16 +0200
Ard Biesheuvel <ardb at kernel.org> wrote:

> On Mon, 17 May 2021 at 13:30, Jonathan Cameron
> <Jonathan.Cameron at huawei.com> wrote:
> >
> > On Mon, 17 May 2021 11:33:19 +0100
> > James Morse <james.morse at arm.com> wrote:
> >  
> > > Standards such as CXL allow memory on PCIe devices to be made
> > > available to the operating system for use as regular memory.
> > >
> > > Document linux's expectations around the behaviour of memory as the
> > > implementations of these new standards may need special treatment in
> > > the OS, firmware or bootloader.
> > >
> > > Signed-off-by: James Morse <james.morse at arm.com>  
> >
> > Hi James,
> >
> > +CC linux-cxl to pick up a few more interesting people who might lose
> > this in the wash of linux-arm-kernel
> >
> > Good to see this description as there has been some confusion on this
> > point. This basically looks like what I'd expect to see. Just a few
> > comments around firmware description towards the end.
> >  
> > > ---
> > >  Documentation/arm64/memory.rst | 31 +++++++++++++++++++++++++++++++
> > >  1 file changed, 31 insertions(+)
> > >
> > > diff --git a/Documentation/arm64/memory.rst b/Documentation/arm64/memory.rst
> > > index 901cd094f4ec..951802aee55f 100644
> > > --- a/Documentation/arm64/memory.rst
> > > +++ b/Documentation/arm64/memory.rst
> > > @@ -167,3 +167,34 @@ from a 52-bit space by enabling the following kernel config options:
> > >
> > >  Note that this option is only intended for debugging applications
> > >  and should not be used in production.
> > > +
> > > +On device memory used as regular memory
> > > +---------------------------------------
> > > +Standards such as CXL allow memory on PCIe device to be made
> > > +available to the operating system for use as regular memory.
> > > +
> > > +If memory is added to the UEFI memory map or DT, or discovered via ACPI's SRAT,
> > > +linux expects it to function in the same way as the bulk DRAM. This section  
> >
> > Linux
> >  
> > > +terms this 'regular memory'.
> > > +
> > > +The kernel may use any attributes to map this memory, e.g. Device-nGnRnE or
> > > +Normal Writeback-Cacheable. The kernel may not be in control of the attributes
> > > +used, e.g. if the memory is used by a KVM guest.
> > > +The kernel will perform cache maintenance to resolve mismatched attributes,
> > > +e.g. invalidating clean stale lines after writing new data when the MMU is
> > > +disabled.
> > > +
> > > +The memory may be used by any instruction supported by the CPUs.
> > > +e.g. Even when the v8.1 LSE atomic instructions are supported, the v8.0
> > > +exclusives are still used for the futex code, and conditional waits, and still
> > > +used by existing user-space binaries. When the CPUs support features such as
> > > +MTE, all regular memory must support MTE tags.
> > > +
> > > +On device memory that does not function in the same way as regular memory must
> > > +not be added to the UEFI memory map or DT, or be discovered via ACPI's SRAT.
> > > +
> > > +On arm64, the kernel does not rewrite the UEFI memory map when memory is added
> > > +or removed. On device memory that is present at boot, but must be removed later  
> >
> > Might be worth giving an example of why memory 'must be removed'?  I'm not sure
> > what you are getting at there.  Specific purpose memory?
> >  
> > > +should be discovered via ACPI's SRAT to ensure it is not used for non-movable
> > > +structures.  
> >
> > Not sure I follow this part.  It could be of type EFI_MEMORY_SP.  
> 
> EFI_MEMORY_SP is an attribute, not a type.

Good point.

> 
> > It should be in SRAT as well, but the EFI type should be sufficient to avoid
> > problems.
> > "The SPM attribute serves as a hint to the OS to avoid allocating this memory
> >  for core OS data or code that can not be relocated."
> >
> > Now I'm not sure the kernel is handling EFI_MEMORY_SP fully yet...  If
> > we need to exclude this approach for now, then this text should perhaps
> > call it out explicitly.
> >  
> 
> The problem with EFI_MEMORY_SP is that it is not a type, but an
> attribute, which gives a hint to the OS about the nature of the
> memory, which the OS is free to ignore.

IIRC the way around that is to use the reserved type + EFI_MEMORY_SP.
An unaware bootloader or OS will then not use it and hence we are safe.
An aware driver can then decide it is safe to "hotplug" said memory.
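
To make that concrete, here is a minimal sketch of the two views of the
same descriptor. The struct and function names are made up for
illustration and are not kernel or firmware APIs; only the numeric
values are taken from the UEFI specification. It assumes an unaware
consumer looks at the type alone and ignores attribute hints.

```c
#include <stdbool.h>
#include <stdint.h>

/* Values as defined by the UEFI specification. */
#define EFI_RESERVED_TYPE        0u
#define EFI_CONVENTIONAL_MEMORY  7u
#define EFI_MEMORY_SP            0x0000000000040000ULL

/* Hypothetical, cut-down memory descriptor, for illustration only. */
struct mem_desc {
	uint32_t type;
	uint64_t attribute;
};

/*
 * Assumed behaviour of an unaware loader or OS: it allocates from
 * conventional memory and never inspects the attribute hints.
 */
static bool unaware_may_allocate(const struct mem_desc *md)
{
	return md->type == EFI_CONVENTIONAL_MEMORY;
}

/*
 * Assumed behaviour of an aware driver: a reserved range that also
 * carries EFI_MEMORY_SP is a candidate for being "hotplugged" later.
 */
static bool aware_may_hotplug(const struct mem_desc *md)
{
	return md->type == EFI_RESERVED_TYPE &&
	       (md->attribute & EFI_MEMORY_SP) != 0;
}
```

With conventional type + EFI_MEMORY_SP the unaware path still allocates
from the range, which is the case Ard describes. With reserved type +
EFI_MEMORY_SP it does not, and only the aware path picks the range up.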

> 
> The UEFI memory map is not only consumed by the OS, but by any driver
> or OS loader that executes in the EFI boot environment, e.g., GPU
> drivers or shim/grub bootloaders. If these are not enlightened and do
> not understand what EFI_MEMORY_SP means, they may (and are entitled to)
> treat this EFI_MEMORY_SP memory as if it were regular memory. If GRUB loads
> the kernel into EFI_MEMORY_SP memory, it had better behave like
> regular memory or things will fall apart.

Two separate issues here. The 'broken' one, where _SP or indeed the
hotplug flag is no use, and the 'must be removed later' one, where
we just don't want to put unmovable allocations in it.

> 
> This means that EFI_MEMORY_SP is really only suitable to describe
> aspects of the memory range that can be happily ignored. MTE or
> atomics capability must be described in a different way.
> 

That's indeed the intent. These are just hints and indeed not suitable for
the cases where things are broken (MTE / Atomics).  In those you
should not be claiming it is normal memory at all.  SRAT doesn't help
you with that though.

The hotplug flag in SRAT is also only a hint. The OS doesn't have
to take any notice of it or support it, nor does any boot loader.  Things
will 'work' with the exception of hot-remove.  If you definitely don't
want your memory to be used by the OS for normal purposes, then
don't present it in a form where it might be.
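
A rough sketch of what "only a hint" means in practice. The flag bit
positions are those of the SRAT Memory Affinity Structure in the ACPI
specification; the function name and the os_honours_hint knob are
hypothetical and only model the two kinds of OS discussed here.

```c
#include <stdbool.h>
#include <stdint.h>

/* Flag bits of the SRAT Memory Affinity Structure (ACPI spec). */
#define SRAT_MEM_ENABLED        (1u << 0)
#define SRAT_MEM_HOT_PLUGGABLE  (1u << 1)

/*
 * Hypothetical policy: may the OS place unmovable allocations in a
 * range described with these flags?
 */
static bool usable_for_unmovable(uint32_t flags, bool os_honours_hint)
{
	if (!(flags & SRAT_MEM_ENABLED))
		return false;	/* entry is not in use at all */
	if (os_honours_hint && (flags & SRAT_MEM_HOT_PLUGGABLE))
		return false;	/* keep it movable so hot-remove can work */
	return true;		/* treated like any other memory */
}
```

An OS that ignores the hint takes the last branch for hot-pluggable
ranges too, so everything works until someone asks for hot-remove.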

Jonathan



> 
> 
> > > +e.g. the kernel text, page tables or the GIC ITS Pending Table.  
> >
> >
> > _______________________________________________
> > linux-arm-kernel mailing list
> > linux-arm-kernel at lists.infradead.org
> > http://lists.infradead.org/mailman/listinfo/linux-arm-kernel  



