[PATCH RFC 0/4] mm, arm64: In-kernel support for memory-deny-write-execute (MDWE)

Topi Miettinen toiwoton at gmail.com
Wed Apr 13 11:39:37 PDT 2022


On 13.4.2022 16.49, Catalin Marinas wrote:
> Hi,
> 
> The background to this is that systemd has a configuration option called
> MemoryDenyWriteExecute [1], implemented as a SECCOMP BPF filter. Its aim
> is to prevent a user task from inadvertently creating an executable
> mapping that is (or was) writeable. Since such BPF filter is stateless,
> it cannot detect mappings that were previously writeable but
> subsequently changed to read-only. Therefore the filter simply rejects
> any mprotect(PROT_EXEC). The side-effect is that on arm64 with BTI
> support (Branch Target Identification), the dynamic loader cannot change
> an ELF section from PROT_EXEC to PROT_EXEC|PROT_BTI using mprotect().
> For libraries, it can resort to unmapping and re-mapping but for the
> main executable it does not have a file descriptor. The original bug
> report is in the Red Hat bugzilla [2], and the subsequent glibc
> workaround for libraries is in [3].
> 
> Add in-kernel support for such feature as a DENY_WRITE_EXEC personality
> flag, inherited on fork() and execve(). The kernel tracks a previously
> writeable mapping via a new VM_WAS_WRITE flag (64-bit only
> architectures). I went for a personality flag by analogy with the
> READ_IMPLIES_EXEC one. However, I'm happy to change it to a prctl() if
> we don't want more personality flags. A minor downside with the
> personality flag is that there is no way for the user to query which
> flags are supported, so in patch 3 I added an AT_FLAGS bit to advertise
> this.

systemd already has a BPF-based option to block personality changes 
(LockPersonality=yes), but I think a prctl() would be easier to lock 
down irrevocably.

Requiring or implying NoNewPrivileges could prevent nasty surprises from 
set-uid Python programs which happen to use FFI.
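
For illustration, this is roughly what setting NoNewPrivileges looks like 
from userspace today. It is only a sketch of the existing 
PR_SET_NO_NEW_PRIVS interface, not of anything in this series; the helper 
name enable_no_new_privs() is made up for the example.

```c
#include <sys/prctl.h>

/*
 * Hypothetical helper: set the no_new_privs bit for the calling thread
 * and read it back. Returns 0 on success, -1 on failure.
 * The bit cannot be cleared once set and is inherited across fork()
 * and execve(), so set-uid binaries no longer gain privileges.
 */
int enable_no_new_privs(void)
{
	if (prctl(PR_SET_NO_NEW_PRIVS, 1, 0, 0, 0) != 0)
		return -1;

	/* PR_GET_NO_NEW_PRIVS returns 1 when the bit is set. */
	return prctl(PR_GET_NO_NEW_PRIVS, 0, 0, 0, 0) == 1 ? 0 : -1;
}
```

A service manager would call something like this before execve(), which 
is what NoNewPrivileges=yes does in systemd.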

> Posting this as an RFC to start a discussion and cc'ing some of the
> systemd guys and those involved in the earlier thread around the glibc
> workaround for dynamic libraries [4]. Before thinking of upstreaming
> this we'd need the systemd folk to buy into replacing the MDWE SECCOMP
> BPF filter with the in-kernel one.

As the author of this feature in systemd (and of a similar feature in 
Firejail), I'd much prefer an in-kernel version to the BPF protection. 
I'd definitely also want to use it in place of BPF on x86_64 and other 
arches.

An in-kernel version would probably make it fairly easy to cover this 
case (maybe it already does):

	fd = memfd_create(...);
	write(fd, malicious_code, sizeof(malicious_code));
	mmap(..., PROT_EXEC, ..., fd);
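
A slightly fuller sketch of the same sequence, usable as a probe. It only 
checks whether the executable mapping is granted and never jumps to the 
bytes. The helper name memfd_exec_map_allowed(), the memfd name and the 
choice of PROT_READ|PROT_EXEC with MAP_PRIVATE are assumptions made for 
the example.

```c
#define _GNU_SOURCE
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Hypothetical probe: write `len` bytes into a fresh memfd and try to
 * map them executable.
 * Returns 1 if the PROT_EXEC mapping was granted, 0 if mmap() refused
 * it, -1 if the setup failed (len == 0, no memfd support, short write).
 */
int memfd_exec_map_allowed(const void *bytes, size_t len)
{
	void *p;
	int fd;

	if (len == 0)
		return -1;

	fd = memfd_create("mdwe-probe", MFD_CLOEXEC);
	if (fd < 0)
		return -1;

	if (write(fd, bytes, len) != (ssize_t)len) {
		close(fd);
		return -1;
	}

	/* No PROT_WRITE and no mprotect() involved at any point. */
	p = mmap(NULL, len, PROT_READ | PROT_EXEC, MAP_PRIVATE, fd, 0);
	close(fd);
	if (p == MAP_FAILED)
		return 0;

	munmap(p, len);
	return 1;
}
```

Since the mapping is never writable and mprotect() is never called, a 
filter that only looks at PROT_WRITE|PROT_EXEC combinations would not 
notice this sequence.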

Other memory W^X implementations include S.A.R.A [1] and SELinux 
EXECMEM/EXECSTACK/EXECHEAP protections [2], [3]. SELinux checks 
IS_PRIVATE(file_inode(file)) and vma->anon_vma != NULL, which might be 
useful additions here too (or future extensions if you prefer).

-Topi

[1] https://smeso.it/sara/
[2] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/security/selinux/hooks.c#n3708
[3] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/security/selinux/hooks.c#n3787


