Re: [asahilinux:nvme/dev 13/17] drivers/nvme/host/pci.c:2249:2-3: Unneeded semicolon

Sven Peter sven at svenpeter.dev
Tue Aug 17 23:26:29 PDT 2021


Alright, I've observed what macOS does by using the simple hypervisor we
built and tracing its MMIO accesses. There actually is a single entry per page
in both the TCB struct and the command queue, and all entries are used.
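
For reference, this is roughly the per-command TCB layout as we currently
understand it (a simplified sketch; the _unk fields are regions whose purpose
we haven't figured out yet):

#include <linux/types.h>

/* Simplified sketch of the per-command TCB; the authoritative layout
 * lives in the nvme/dev branch and the _unk fields are still unknown. */
struct apple_nvmmu_tcb {
        u8      opcode;         /* mirrors the NVMe command opcode */
        u8      dma_flags;      /* DMA direction bits */
        u8      command_id;     /* tag; there is one TCB per queue entry */
        u8      _unk0;
        __le32  length;         /* total transfer length in bytes */
        u8      _unk1[16];
        __le64  prp1;           /* same PRPs as in the queue entry */
        __le64  prp2;
        u8      _unk2[16];
        u8      aes_iv[8];      /* presumably disk-encryption related */
        u8      _aes_unk[64];
};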

The reason for that needs a little more background on how XNU and its
security architecture work:

On these machines, XNU is split into two parts: The main kernel with all its
extensions and hardware drivers, and a small section called the "Page
Protection Layer" or PPL.
These machines have CPU extensions that allow them to prevent the normal kernel
from writing to pagetables or accessing any MMIO related to IOMMUs. Switching to
the PPL section is done with a custom instruction ("genter") which changes the
memory permissions such that PPL is then able to modify pagetables and configure
the IOMMUs. It kind of works like a low-overhead hypervisor that controls pagetables.
There are some writeups available about this if you are curious about
the details [1][2][3].
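
To make that split a little more concrete, here's a purely illustrative sketch
of such a call gate. Every name below is invented; the real dispatch mechanism
and the genter/gexit instructions are covered in [1] and [3]:

#include <stdint.h>

/* Purely illustrative -- all identifiers here are made up. */
enum ppl_op {
        PPL_OP_MAP_PAGE,        /* update a pagetable entry */
        PPL_OP_IOMMU_MAP,       /* program an IOMMU mapping */
};

/* Stand-in for the "genter" trampoline: it raises the memory
 * permissions so that pagetables and IOMMU MMIO become writable,
 * runs the requested operation inside PPL, and then drops back to
 * normal kernel permissions with "gexit". */
extern long ppl_enter(int op, uint64_t arg0, uint64_t arg1);

static int kernel_map_page(uint64_t va, uint64_t pa)
{
        /* The normal kernel can only request the mapping; the actual
         * pagetable write happens inside PPL after validation. */
        return (int)ppl_enter(PPL_OP_MAP_PAGE, va, pa);
}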

The TCB structs and the NVMMU MMIO registers cannot be accessed by the part of the
kernel that contains the NVMe driver. The NVMe driver can only prepare the
command structure and fill the PRP list with entries for the DMA buffers.
It then calls out to PPL, which verifies that all the pages listed inside the PRP
list are allowed for DMA and then constructs the same structure again inside
protected memory that can no longer be touched by the regular kernel.

Then the NVMMU is configured with this secondary PRP list from inside PPL
before control is handed back to the NVMe driver. Effectively, this prevents
someone who has broken into the normal kernel from just DMAing over any
buffer they want.
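
In rough pseudocode (again with invented names, reusing the ppl_enter()
stand-in from the sketch above), the flow we observed looks like this:

#include <stdint.h>

struct nvme_command;            /* only used via pointers here */

extern uint64_t *alloc_prp_list(void);
extern void fill_prp_list(uint64_t *prps, struct nvme_command *cmd);
extern long ppl_enter(int op, uint64_t arg0, uint64_t arg1);
extern void ring_sq_doorbell(uint32_t tag);
#define PPL_OP_NVME_SUBMIT 3    /* invented dispatch id */

static void macos_style_submit(struct nvme_command *cmd, uint32_t tag)
{
        uint64_t *prps = alloc_prp_list();

        /* Outside PPL: prepare the PRP list in ordinary kernel memory. */
        fill_prp_list(prps, cmd);

        /* genter into PPL.  PPL verifies that every listed page is an
         * allowed DMA target, rebuilds the identical PRP list inside
         * PPL-protected memory and points the NVMMU at that copy. */
        ppl_enter(PPL_OP_NVME_SUBMIT, tag, (uintptr_t)prps);

        /* Only then does the regular driver ring the doorbell. */
        ring_sq_doorbell(tag);
}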

For Linux we can ignore all this and just point the NVMMU and the
queue entry to the same PRP list.
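
In the submission path that amounts to something like this (a sketch; the
remaining TCB fields such as length and the DMA direction flags, plus all
error handling, are omitted):

#include <linux/nvme.h>
#include <linux/string.h>

static void apple_nvmmu_fill_tcb(struct apple_nvmmu_tcb *tcb,
                                 struct nvme_command *cmd, u8 tag)
{
        memset(tcb, 0, sizeof(*tcb));
        tcb->opcode = cmd->common.opcode;
        tcb->command_id = tag;
        /* Point the NVMMU at the very same PRP list the queue entry
         * uses -- no PPL-style shadow copy required. */
        tcb->prp1 = cmd->common.dptr.prp1;
        tcb->prp2 = cmd->common.dptr.prp2;
}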



Sven


[1] Jonathan Levin's writeup about the Page Protection Layer http://newosxbook.com/articles/CasaDePPL.html
[2] siguza's writeup about how this was done on iOS https://siguza.github.io/APRR/
[3] my writeup about the CPU extensions https://blog.svenpeter.dev/posts/m1_sprr_gxf/


On Mon, Aug 9, 2021, at 22:11, Sven Peter wrote:
> 
> 
> On Mon, Aug 9, 2021, at 17:53, Arnd Bergmann wrote:
> > On Mon, Aug 9, 2021 at 4:29 PM Christoph Hellwig <hch at infradead.org> wrote:
> > > Also can one of you look at how PRPs are actually used by macOS?  Given
> > > that this device always seems to be behind an IOMMU, creating one entry
> > > per page seems rather weird given that the apple_nvmmu_tcb structure
> > > already contains the full length.  Maybe it actually ignores all but
> > > the first PRP?
> > 
> > I'll leave this up to Sven to answer. He also wrote the iommu driver,
> > so he probably has a good idea of what is going on here already.
> > 
> >       Arnd
> > 
> 
> Not yet, but figuring out how this NVMe-IOMMU works in detail was
> already on my TODO list :-)
> 
> Some background - the M1 has at least four different IOMMU-like
> HW blocks:
> DART (for which I wrote a driver and where I'd actually know what's going
> on in detail), SART (a simple DMA address filter required by the NVMe
> co-processor for non-NVMe transactions), this weird NVMe IOMMU (which
> also seems to be somehow related to disk encryption), and GART for their GPU.
> 
> 
> 
> Sven
> 


-- 
Sven Peter


