[PATCH] irqchip/gic-v3-its: Mark some in-memory data structures as 'decrypted'

Will Deacon will at kernel.org
Thu Dec 9 01:10:01 PST 2021


On Wed, Dec 08, 2021 at 06:20:36PM +0000, Robin Murphy wrote:
> On 2021-12-08 15:59, Will Deacon wrote:
> > The GICv3 ITS driver allocates memory for its tables using alloc_pages()
> > and performs explicit cache maintenance if necessary. On systems such
> > as those running pKVM, where the memory encryption API is implemented,
> > memory shared with the ITS must first be transitioned to the "decrypted"
> > state, as it would be if allocated via the DMA API.
> > 
> > Allow pKVM guests to interact with an ITS emulation by ensuring that the
> > shared pages are decrypted at the point of allocation and encrypted
> > again upon free().
> > 
> > Cc: Thomas Gleixner <tglx at linutronix.de>
> > Cc: Marc Zyngier <maz at kernel.org>
> > Signed-off-by: Will Deacon <will at kernel.org>
> > ---
> > 
> > Although pKVM doesn't yet expose share/unshare hypercalls to the guest,
> > this change is agnostic of the hypervisor and could be queued
> > independently as it has no functional impact when the memory encryption
> > API is not implemented.
> > 
> >   drivers/irqchip/irq-gic-v3-its.c | 40 ++++++++++++++++++++++++++------
> >   1 file changed, 33 insertions(+), 7 deletions(-)
> > 
> > diff --git a/drivers/irqchip/irq-gic-v3-its.c b/drivers/irqchip/irq-gic-v3-its.c
> > index eb0882d15366..4559b8dfb9bc 100644
> > --- a/drivers/irqchip/irq-gic-v3-its.c
> > +++ b/drivers/irqchip/irq-gic-v3-its.c
> > @@ -27,6 +27,7 @@
> >   #include <linux/of_pci.h>
> >   #include <linux/of_platform.h>
> >   #include <linux/percpu.h>
> > +#include <linux/set_memory.h>
> >   #include <linux/slab.h>
> >   #include <linux/syscore_ops.h>
> > @@ -2166,6 +2167,7 @@ static void gic_reset_prop_table(void *va)
> >   	/* Make sure the GIC will observe the written configuration */
> >   	gic_flush_dcache_to_poc(va, LPI_PROPBASE_SZ);
> > +	set_memory_decrypted((unsigned long)va, LPI_PROPBASE_SZ >> PAGE_SHIFT);
> 
> Hmm, that would seem to imply we've just done the memset() and cache clean
> on encrypted memory... that's not right, surely?

Actually, that's exactly what we want! The *crypted state is only of concern
to the DMA accesses, so the idea is that you clear out / initialiase the
buffer and only then decrypt the memory for the device to access it.

The DMA code does the same sort of thing -- for example, see
atomic_pool_expand() in kernel/dma/pool.c.

> >   }
> >   static struct page *its_allocate_prop_table(gfp_t gfp_flags)
> > @@ -2183,8 +2185,10 @@ static struct page *its_allocate_prop_table(gfp_t gfp_flags)
> >   static void its_free_prop_table(struct page *prop_page)
> >   {
> > -	free_pages((unsigned long)page_address(prop_page),
> > -		   get_order(LPI_PROPBASE_SZ));
> > +	unsigned long va = (unsigned long)page_address(prop_page);
> > +
> > +	set_memory_encrypted(va, LPI_PROPBASE_SZ >> PAGE_SHIFT);
> 
> Similarly to [1], it might be worth at least pretending to consider failure
> in these instances. (Side note, might set_memory_encrypted() logically
> warrant a __must_check annotation?)

There's not much we can do, but it looks like the DMA allocator leaks the
memory rather than freeing it, so I'll add that.

> > @@ -2934,6 +2944,7 @@ static int its_alloc_collections(struct its_node *its)
> >   static struct page *its_allocate_pending_table(gfp_t gfp_flags)
> >   {
> >   	struct page *pend_page;
> > +	void *va;
> >   	pend_page = alloc_pages(gfp_flags | __GFP_ZERO,
> >   				get_order(LPI_PENDBASE_SZ));
> > @@ -2941,14 +2952,19 @@ static struct page *its_allocate_pending_table(gfp_t gfp_flags)
> >   		return NULL;
> >   	/* Make sure the GIC will observe the zero-ed page */
> > -	gic_flush_dcache_to_poc(page_address(pend_page), LPI_PENDBASE_SZ);
> > +	va = page_address(pend_page);
> > +	gic_flush_dcache_to_poc(va, LPI_PENDBASE_SZ);
> > +	set_memory_decrypted((unsigned long)va, LPI_PENDBASE_SZ >> PAGE_SHIFT);
> 
> Again, it looks fundamentally sketchy to decrypt memory *after* already
> doing something to it - under a "real" encryption scheme the cleaned out
> zeros might now appear as a page full of gibberish ciphertext, which would
> not be what you want.
> 
> That said, aren't the pending tables the completely IMP-DEF ones that only
> the ITS itself ever touches? As-is should we even need to "decrypt" them at
> all?

We do, as we need the ITS to be able to access the pages. For pKVM, this
is actually the virtual ITS emulation in the host.

> The overall feeling here is that we're not being particularly consistent
> about whether we're readily leaning on the assumption of "encryption" only
> ever meaning CPU stage 2 protection here, or whether we're trying to uphold
> the illusion of the more general notion with the ITS only being able to
> observe "plaintext" shared memory.

Really, I've just tried to follow the way this is used for DMA: pages which
need to be accessed by the device must first be set as decrypted, and then
set back to encrypted before freeing back to the page allocator. This maps
directly to what we're doing with the stage-2 (i.e. decrypt means share, and
encrypt means unshare) but it would equally apply to real memory encryption
schemes along the lines of what x86, power and s390 are doing.

Will



More information about the linux-arm-kernel mailing list