[PATCH v3 1/2] arm64: Support execute-only permissions with Enhanced PAN

Tue Jan 26 07:23:31 EST 2021

On Tue, Jan 26, 2021 at 12:05:42PM +0000, Catalin Marinas wrote:
> On Tue, Jan 26, 2021 at 11:03:06AM +0000, Will Deacon wrote:
> > On Tue, Jan 19, 2021 at 04:07:22PM +0000, Vladimir Murzin wrote:
> > >  #define pte_valid_not_user(pte) \
> > > -	((pte_val(pte) & (PTE_VALID | PTE_USER)) == PTE_VALID)
> > > +	((pte_val(pte) & (PTE_VALID | PTE_USER | PTE_UXN)) == (PTE_VALID | PTE_UXN))
> > >  #define pte_valid_user(pte) \
> > >  	((pte_val(pte) & (PTE_VALID | PTE_USER)) == (PTE_VALID | PTE_USER))
> > 
> > Won't pte_valid_user() go wrong for exec-only mappings now?
> 
> I wondered about the same in November (and reproducing my comment
> below):
> 
> https://lore.kernel.org/linux-arm-kernel/20201119182236.GN4376@gaia/
> 
> pte_valid_user() checks for both PTE_VALID and PTE_USER. In theory, a
> !PTE_UXN is also user-accessible but it's only used in gup_pte_range()
> via pte_access_permitted(). If "access" in the latter means only
> read/write, we should be ok. If we keep it as is, a comment would
> be useful.

I think we should ensure that:

  pte_valid(pte) && (pte_valid_not_user(pte) || pte_valid_user(pte))

is always true, but I don't think it is after this patch. It reminds me
of some of the tests that Anshuman wrote.

> > > diff --git a/arch/arm64/include/asm/sysreg.h b/arch/arm64/include/asm/sysreg.h
> > > index 8b5e7e5..47e9fdc 100644
> > > --- a/arch/arm64/include/asm/sysreg.h
> > > +++ b/arch/arm64/include/asm/sysreg.h
> > > @@ -591,6 +591,7 @@
> > >  	(SCTLR_EL2_RES1 | ENDIAN_SET_EL2)
> > >  
> > >  /* SCTLR_EL1 specific flags. */
> > > +#define SCTLR_EL1_EPAN		(BIT(57))
> > >  #define SCTLR_EL1_ATA0		(BIT(42))
> > >  
> > >  #define SCTLR_EL1_TCF0_SHIFT	38
> > > @@ -631,7 +632,7 @@
> > >  	 SCTLR_EL1_SED  | SCTLR_ELx_I    | SCTLR_EL1_DZE  | SCTLR_EL1_UCT   | \
> > >  	 SCTLR_EL1_NTWE | SCTLR_ELx_IESB | SCTLR_EL1_SPAN | SCTLR_ELx_ITFSB | \
> > >  	 SCTLR_ELx_ATA  | SCTLR_EL1_ATA0 | ENDIAN_SET_EL1 | SCTLR_EL1_UCI   | \
> > > -	 SCTLR_EL1_RES1)
> > > +	 SCTLR_EL1_EPAN | SCTLR_EL1_RES1)
> > 
> > Why is this handled differently to normal PAN, where the SCTLR is written in
> > cpu_enable_pan()?
> 
> That's how it was done in an early version but I suggested we move it to
> the default SCTLR bits to save some lines:
> 
> https://lore.kernel.org/linux-arm-kernel/20201202182303.GC21091@gaia/
> 
> With classic PAN, we only enable it if all the CPUs support it. For EPAN
> that's not necessary since EPAN should depend on PAN being enabled.

Ok, let's just hope we don't have any CPU errata requiring us to disable
it.

> It's also cached in the TLB, so it's a bit of a hassle with CnP. See
> this past discussion:
> 
> https://lore.kernel.org/linux-arm-kernel/X7P+r%2Fl3ewvaf1aV@trantor/

Thanks. None of this is in the Arm ARM afaict, so I can't figure this stuff
out on my own.

> > > diff --git a/arch/arm64/mm/fault.c b/arch/arm64/mm/fault.c
> > > index 3c40da4..c32095f6 100644
> > > --- a/arch/arm64/mm/fault.c
> > > +++ b/arch/arm64/mm/fault.c
> > > @@ -529,6 +529,9 @@ static int __kprobes do_page_fault(unsigned long far, unsigned int esr,
> > >  	if (faulthandler_disabled() || !mm)
> > >  		goto no_context;
> > >  
> > > +	if (cpus_have_const_cap(ARM64_HAS_EPAN))
> > > +		vm_flags &= ~VM_EXEC;
> > 
> > This needs a comment, and I think it would be cleaner to rework the vm_flags
> > initialisation along the lines of:
> > 
> > 	unsigned long vm_flags;
> > 	unsigned long mm_flags = FAULT_FLAG_DEFAULT;
> > 
> > 	if (user_mode(regs))
> > 		mm_flags |= FAULT_FLAG_USER;
> > 
> > 	if (is_el0_instruction_abort(esr)) {
> > 		vm_flags = VM_EXEC;
> > 		mm_flags |= FAULT_FLAG_INSTRUCTION;
> > 	} else if (is_write_abort(esr)) {
> > 		vm_flags = VM_WRITE;
> > 		mm_flags |= FAULT_FLAG_WRITE;
> > 	} else {
> > 		vm_flags = VM_READ | VM_WRITE;
> > 		if (!cpus_have_const_cap(ARM64_HAS_EPAN))
> > 			vm_flags |= VM_EXEC:
> > 	}
> > 
> > (but again, please add a comment to that last part because I still don't
> > really follow what you're doing)
> 
> Every time I come across this vm_flags handling I have to spend some
> minutes to re-understand what it does. So vm_flags tells us what bits we
> must have in vma->vm_flags for the fault to be benign. But we start with
> all rwx flags and clear VM_EXEC if EPAN since exec does not imply read,
> making it more confusing.
> 
> I think your rewrite is cleaner, maybe with some comment why we add
> VM_EXEC when !EPAN.

Agreed.

Will