[PATCH RFC 0/4] mm, arm64: In-kernel support for memory-deny-write-execute (MDWE)

Thu Apr 21 11:33:05 PDT 2022

On Thu, Apr 21, 2022 at 10:41:43AM -0700, Kees Cook wrote:
> On Thu, Apr 21, 2022 at 06:24:21PM +0100, Catalin Marinas wrote:
> > On Thu, Apr 21, 2022 at 09:42:23AM -0700, Kees Cook wrote:
> > > On Thu, Apr 21, 2022 at 04:35:15PM +0100, Catalin Marinas wrote:
> > > > Do we want the "was PROT_WRITE" or do we just reject mprotect(PROT_EXEC) if
> > > > the vma is not already PROT_EXEC? The latter is closer to the current
> > > > systemd approach. The former allows an mprotect(PROT_EXEC) if the
> > > > mapping was PROT_READ only for example.
> > > > 
> > > > I'd drop the "was PROT_WRITE" for now if the aim is a drop-in
> > > > replacement for BPF MDWE.
> > > 
> > > I think "was PROT_WRITE" is an important part of the defense that
> > > couldn't be done with a simple seccomp filter (which is why the filter
> > > ended up being a problem in the first place).
> > 
> > I would say "was PROT_WRITE" is slightly more relaxed than "is not
> > already PROT_EXEC". The seccomp filter can't do "is not already
> > PROT_EXEC" either since it only checks the mprotect() arguments, not the
> > current vma flags.
> > 
> > So we have (with sub-cases):
> > 
> > 1. Current BPF filter:
> > 
> >    a)	mmap(PROT_READ|PROT_WRITE|PROT_EXEC);	// fails
> > 
> >    b)	mmap(PROT_READ|PROT_EXEC);
> >		mprotect(PROT_READ|PROT_EXEC|PROT_BTI);	// fails
> >
> >    c)	mmap(PROT_READ);
> >		mprotect(PROT_READ|PROT_EXEC);		// fails
> >
> >    d)	mmap(PROT_READ|PROT_WRITE);
> >		mprotect(PROT_READ);
> >		mprotect(PROT_READ|PROT_EXEC);		// fails
> > 
> > 2. "is not already PROT_EXEC":
> > 
> >    a)	mmap(PROT_READ|PROT_WRITE|PROT_EXEC);	// fails
> > 
> >    b)	mmap(PROT_READ|PROT_EXEC);
> >		mprotect(PROT_READ|PROT_EXEC|PROT_BTI);	// passes
> > 
> >    c)	mmap(PROT_READ);
> >		mprotect(PROT_READ|PROT_EXEC);		// fails
> >
> >    d)	mmap(PROT_READ|PROT_WRITE);
> >		mprotect(PROT_READ);
> >		mprotect(PROT_READ|PROT_EXEC);		// fails
> > 
> > 3. "is or was not PROT_WRITE":
> > 
> >    a)	mmap(PROT_READ|PROT_WRITE|PROT_EXEC);	// fails
> > 
> >    b)	mmap(PROT_READ|PROT_EXEC);
> >		mprotect(PROT_READ|PROT_EXEC|PROT_BTI);	// passes
> > 
> >    c)	mmap(PROT_READ);
> >		mprotect(PROT_READ|PROT_EXEC);		// passes
> > 
> >    d)	mmap(PROT_READ|PROT_WRITE);
> >		mprotect(PROT_READ);
> >	 	mprotect(PROT_READ|PROT_EXEC);		// fails
> 
> [edited above to show each case]

Thanks, I was in a rush to get home ;).

> restated what was already summarized:
> Problem is 1.b. 2 and 3 solve it. 3 is more relaxed (c passes).
> 
> > If we don't care about 3.c, we might as well go for (2). I don't mind; I
> > already went for (3) in this series. I think either of them would not be
> > a regression on MDWE, unless there is some test that attempts 3.c and
> > expects it to fail.
> 
> I should stop arguing for a less restrictive mode. ;) It just feels weird
> that the combinations are API-mediated, rather than logically defined:
> I can do PROT_READ|PROT_EXEC with mmap but not mprotect under 2. As
> opposed to saying "the vma cannot be executable if it is or ever was
> writable". I find the latter much easier to reason about as far as the
> expectations of system state.

I had the same reasoning, hence option 3 in this series. I prefer to
treat mmap(PROT_READ|PROT_EXEC) and mprotect(PROT_READ|PROT_EXEC) in a
similar way.
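
For reference, the three options can be sketched as a small user-space
model. This is only an illustration of the cases quoted above, not the
code in this series: the M_* flags, struct vma_model and all function
names are made up for the sketch. It also assumes that a request
without PROT_EXEC is always allowed and that a request with both
PROT_WRITE and PROT_EXEC is always refused, neither of which is spelled
out in the table.

```c
#include <assert.h>
#include <stdbool.h>

/* Model-only flags; hypothetical stand-ins for the real PROT_* values. */
enum {
	M_READ  = 1 << 0,
	M_WRITE = 1 << 1,
	M_EXEC  = 1 << 2,
	M_BTI   = 1 << 4,
};

enum mdwe_policy {
	POLICY_BPF_FILTER       = 1,	/* 1. current BPF filter */
	POLICY_NOT_ALREADY_EXEC = 2,	/* 2. "is not already PROT_EXEC" */
	POLICY_NEVER_WRITABLE   = 3,	/* 3. "is or was PROT_WRITE" */
};

/* Hypothetical vma state: current protection plus a sticky "was writable". */
struct vma_model {
	int prot;
	bool was_writable;
};

static void vma_commit(struct vma_model *vma, int prot)
{
	vma->prot = prot;
	if (prot & M_WRITE)
		vma->was_writable = true;
}

/* All three options refuse a mapping that is writable and executable. */
static bool model_mmap(struct vma_model *vma, int prot)
{
	if ((prot & M_WRITE) && (prot & M_EXEC))
		return false;
	vma->prot = 0;
	vma->was_writable = false;
	vma_commit(vma, prot);
	return true;
}

static bool model_mprotect(enum mdwe_policy policy, struct vma_model *vma,
			   int prot)
{
	bool allowed = true;

	if (prot & M_EXEC) {
		if (prot & M_WRITE)
			allowed = false;	/* assumption: W+X never allowed */
		else if (policy == POLICY_BPF_FILTER)
			allowed = false;	/* sees arguments only, no vma state */
		else if (policy == POLICY_NOT_ALREADY_EXEC)
			allowed = (vma->prot & M_EXEC) != 0;
		else
			allowed = !vma->was_writable;
	}
	if (allowed)
		vma_commit(vma, prot);
	return allowed;
}

/* Replay sub-case 'a'..'d' and return whether the final call succeeds. */
bool run_case(enum mdwe_policy policy, char which)
{
	struct vma_model vma = { 0 };

	switch (which) {
	case 'a':
		return model_mmap(&vma, M_READ | M_WRITE | M_EXEC);
	case 'b':
		model_mmap(&vma, M_READ | M_EXEC);
		return model_mprotect(policy, &vma, M_READ | M_EXEC | M_BTI);
	case 'c':
		model_mmap(&vma, M_READ);
		return model_mprotect(policy, &vma, M_READ | M_EXEC);
	case 'd':
		model_mmap(&vma, M_READ | M_WRITE);
		model_mprotect(policy, &vma, M_READ);
		return model_mprotect(policy, &vma, M_READ | M_EXEC);
	default:
		return false;
	}
}

/* Rows are options 1..3, columns are sub-cases a..d, as quoted above. */
void check_against_table(void)
{
	static const bool expected[3][4] = {
		{ false, false, false, false },
		{ false, true,  false, false },
		{ false, true,  true,  false },
	};

	for (int p = 0; p < 3; p++)
		for (int c = 0; c < 4; c++)
			assert(run_case((enum mdwe_policy)(p + 1),
					(char)('a' + c)) == expected[p][c]);
}
```

Calling check_against_table() from a test main() replays all twelve
sub-cases. The only per-vma state option 3 needs beyond the current
protection is the sticky was_writable bit, which is what lets it treat
mmap(PROT_READ|PROT_EXEC) and mprotect(PROT_READ|PROT_EXEC) alike.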

-- 
Catalin