[PATCH 3/6] mm: introduce secretmemfd system call to create "secret" memory areas

James Bottomley jejb at linux.ibm.com
Mon Jul 20 11:51:45 EDT 2020


On Mon, 2020-07-20 at 13:30 +0200, Arnd Bergmann wrote:
> On Mon, Jul 20, 2020 at 11:25 AM Mike Rapoport <rppt at kernel.org>
> wrote:
> > 
> > From: Mike Rapoport <rppt at linux.ibm.com>
> > 
> > Introduce "secretmemfd" system call with the ability to create
> > memory areas visible only in the context of the owning process and
> > mapped neither into other processes nor into the kernel page
> > tables.
> > 
> > The user will create a file descriptor using the secretmemfd system
> > call where flags supplied as a parameter to this system call will
> > define the desired protection mode for the memory associated with
> > that file descriptor. Currently there are two protection modes:
> > 
> > * exclusive - the memory area is unmapped from the kernel direct map
> >               and it is present only in the page tables of the owning mm.
> > * uncached  - the memory area is present only in the page tables of the
> >               owning mm and it is mapped there as uncached.
> > 
> > For instance, the following example will create an uncached mapping
> > (error handling is omitted):
> > 
> >         fd = secretmemfd(SECRETMEM_UNCACHED);
> >         ftruncate(fd, MAP_SIZE);
> >         ptr = mmap(NULL, MAP_SIZE, PROT_READ | PROT_WRITE, MAP_SHARED,
> >                    fd, 0);
> > 
> > Signed-off-by: Mike Rapoport <rppt at linux.ibm.com>
> 
> I wonder if this should be more closely related to dmabuf file
> descriptors, which are already used for a similar purpose: sharing
> access to secret memory areas that are not visible to the OS but can
> be shared with hardware through device drivers that can import a
> dmabuf file descriptor.

I'll assume you mean the dmabuf userspace API?  Because the kernel API
is completely device exchange specific and wholly inappropriate for
this use case.

The user space API of dmabuf uses a pseudo-filesystem.  So you mount
the dmabuf file type (and by "you" I mean root because an ordinary user
doesn't have sufficient privilege).  This is basically because every
dmabuf is usable by any user who has permissions.  This really isn't
the initial interface we want for secret memory because secret regions
are supposed to be per process and not shared (at least we don't want
other tenants to see who's using what).

Once you have the fd, you can seek to find the size, mmap, poll and
ioctl it.  The ioctls are all to do with memory synchronization (as
you'd expect from a device backed region) and the mmap is handled by
the dma_buf_ops, which is device specific.  Sizing is missing because
the size is reported by the device, not settable by the user.

What we want is the ability to get an fd, set the properties and the
size and mmap it.  This is pretty much a 100% overlap with the memfd
API and not much overlap with the dmabuf one, which is why I don't
think the interface is very well suited.
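
For comparison, the "get an fd, set the size and mmap it" sequence looks
roughly like this with the existing memfd API.  This is only a sketch:
map_memfd_region() and the "example-region" name are made up for
illustration and are not part of the patch, and it assumes a glibc that
provides the memfd_create() wrapper.  A secretmemfd() call would presumably
take the place of memfd_create() here, with the protection mode passed in
its flags.

```c
#define _GNU_SOURCE
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Hypothetical helper, not from the patch: obtain an fd, set its size
 * and map it, using the existing memfd API.  Returns the mapping and
 * stores the fd in *fd_out, or returns NULL on failure.
 */
static void *map_memfd_region(size_t size, int *fd_out)
{
	void *ptr;
	int fd;

	/* Step 1: get an fd (the name is only a debugging label). */
	fd = memfd_create("example-region", MFD_CLOEXEC);
	if (fd < 0)
		return NULL;

	/* Step 2: the user, not a device, sets the size. */
	if (ftruncate(fd, (off_t)size) < 0) {
		close(fd);
		return NULL;
	}

	/* Step 3: map it into the calling process. */
	ptr = mmap(NULL, size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
	if (ptr == MAP_FAILED) {
		close(fd);
		return NULL;
	}

	*fd_out = fd;
	return ptr;
}
```

The caller is expected to munmap() the region and close() the fd when done.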

James



