[PATCHv2 2/2] x86/mm: Make memremap(MEMREMAP_WB) map memory as encrypted by default

Kirill A. Shutemov kirill.shutemov at linux.intel.com
Tue Dec 10 05:26:40 PST 2024


On Thu, Nov 21, 2024 at 12:49:52PM +0100, Borislav Petkov wrote:
> On Tue, Nov 19, 2024 at 10:21:05AM +0200, Kirill A. Shutemov wrote:
> > Sure, we can workaround every place that touches such ranges.
> 
> Every place? Which every place? I thought this is only an EISA issue?

I looked at other places where we call memremap(MEMREMAP_WB) such as
acpi_wakeup_cpu(). We actually get encrypted/private mapping for this
callsite despite __ioremap_caller() being called encrypted == false.
This happens because of IORES_MAP_ENCRYPTED check in __ioremap_caller().

So we depend on the BIOS here. The EISA problem happens because the
target memory is in !IORES_MAP_ENCRYPTED memory.

It's hard to say if any other memremap(MEMREMAP_WB) would trigger the
issue. And what will happen after next BIOS update.

> Then clearly your changelogs need to expand considerably more what we're
> *really* addressing here.
> 
> > Or we can address problem at the root and make creating decrypted/shared
> > mappings explicit.
> 
> What is the problem? That KVM implicitly converts memory to shared? Why does
> KVM do that an can it be fixed not to?
> 
> Doesn't sound like the guest's problem.

Well, the problem is on the both sides.

VMM behaviour on such accesses is not specified in any spec. AFAIK all
current VMM implementations do this implicit conversion.

I think it has to be fixed. VMMs (not only KVM) should not silently
convert memory to shared. But VMMs cannot make memory access to go away.
The only option they have is to inject #VE instead indicating bogus
access. At this point it becomes a guest problem.

It will get fixed in VMMs naturally when TDX Connect gets enabled.
With a secure device assigned to a TD, VMM would loose the ability to
convert memory on its own. The guest would have to unlock the memory
first. This will make implicit conversion impossible.

But it also means guest should never initiate shared access without
explicit conversion. Otherwise #VE will crash it.

> Or maybe this needs a lot more explanation what we're fixing here.
> 
> > Such mappings have both functional (as we see here) and security
> > implications (VMM can manipulate the guest memory range). We should not
> > create decrypted mappings by default on legacy interfaces.
> 
> So we're getting closer.
> 
> The changes themselves are fine but your text is missing a lot about what
> we're fixing here. When I asked, I barely scratched the surface. So can we
> elaborate here pls?

What about this:

x86/mm: Make memremap(MEMREMAP_WB) map memory as encrypted by default

Currently memremap(MEMREMAP_WB) can produce decrypted/shared mapping:

memremap(MEMREMAP_WB)
  arch_memremap_wb()
    ioremap_cache()
      __ioremap_caller(.encrytped = false)

In such cases, the IORES_MAP_ENCRYPTED flag on the memory will determine
if the resulting mapping is encrypted or decrypted.

Creating a decrypted mapping without explicit request from the caller is
risky:

  - It can inadvertently expose the guest's data and compromise the
    guest.

  - Accessing private memory via shared/decrypted mapping on TDX will
    either trigger implicit conversion to shared or #VE (depending on
    VMM implementation).

    Implicit conversion is destructive: subsequent access to the same
    memory via private mapping will trigger a hard-to-debug #VE crash.

The kernel already provides a way to request decrypted mapping
explicitly via the MEMREMAP_DEC flag.

Modify memremap(MEMREMAP_WB) to produce encrypted/private mapping by
default unless MEMREMAP_DEC is specified.

This change fixes the crash on kexec in TDX guests if CONFIG_EISA is
enabled.
 
-- 
  Kiryl Shutsemau / Kirill A. Shutemov



More information about the linux-riscv mailing list