[RFC PATCH 05/13] x86/um: nommu: syscall translation by zpoline
Hajime Tazaki
thehajime at gmail.com
Sat Oct 26 00:36:03 PDT 2024
On Sat, 26 Oct 2024 00:20:49 +0900,
Johannes Berg wrote:
>
> On Fri, 2024-10-25 at 21:58 +0900, Hajime Tazaki wrote:
> >
> > > > + if (down_write_killable(&mm->mmap_lock)) {
> > > > + err = -EINTR;
> > > > + return err;
> > >
> > > ?
> >
> > the lock isn't needed actually so, will remove it.
>
> Oh, I was just looking at the weird handling of the err variable :)
Ah, now I see. I'll restore the lock part and use a direct `return -EINTR` instead of assigning to err first.
> > > What happens if the binary JITs some code and you don't find it? I don't
> > > remember from your talk - there you seemed to say this was fine just
> > > slow, but that was zpoline in a different context (container)?
> >
> > instructions loaded after execve family (like JIT generated code,
> > loaded with dlopen, etc) isn't going to be translated. we can
> > translated it by tweaking the userspace loader (ld.so w/ LD_PRELOAD)
> > or hook mprotect(2) syscall before executing JIT generated code.
> > generic description is written in the document ([12/13]).
>
> Guess I should've read that, sorry.
No, no. This part is a completely new feature, and I'd like to
explain any unclear points to help understanding, so any input is
always welcome.
# btw, the talk at the last netdev was not in a container-specific
context; it focused more on the syscall hook mechanism itself, so I
didn't go into much detail at that time.
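
To make the translation step discussed above concrete, here is a minimal sketch of the rewrite that zpoline performs: each `syscall` (0x0f 0x05) or `sysenter` (0x0f 0x34) byte pair is replaced with `call *%rax` (0xff 0xd0). The helper name `translate_syscalls` is hypothetical and is not taken from the patchset; the translator in the patch may be structured differently. The same routine is what one would have to run again on code that appears after execve (JIT output, dlopen'ed libraries), for example from an mprotect(2) hook.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper (not from the patch): scan a code buffer and
 * replace each `syscall` (0x0f 0x05) or `sysenter` (0x0f 0x34) byte
 * pair with `call *%rax` (0xff 0xd0). Returns the number of sites
 * that were rewritten.
 *
 * Caveat: a raw byte scan can also match these byte pairs inside the
 * operands of other instructions. A real translator has to decode
 * instruction boundaries before patching.
 */
static size_t translate_syscalls(uint8_t *code, size_t len)
{
	size_t i = 0, patched = 0;

	assert(code != NULL);

	while (i + 1 < len) {
		if (code[i] == 0x0f &&
		    (code[i + 1] == 0x05 || code[i + 1] == 0x34)) {
			code[i] = 0xff;     /* opcode of call r/m64 */
			code[i + 1] = 0xd0; /* ModRM byte selecting %rax */
			patched++;
			i += 2;
		} else {
			i++;
		}
	}
	return patched;
}
```

Both instructions are two bytes long, so the rewrite is done in place and no surrounding code has to move. Any region that is never passed through such a routine keeps its original `syscall` instructions, which is how untranslated code ends up calling the host directly.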
> > > Perhaps UML could additionally install a seccomp filter or something on
> > > itself while running a userspace program? Hmm.
> >
> > I'm trying to understand the purpose of seccomp filter you suggested
> > here; is it for preventing executed by untranslated code ?
>
> Yeah, that's what I was wondering.
>
> Obviously you have to be able to get rid of the seccomp filter again so
> it's not foolproof, but perhaps not _that_ bad?
>
> I'm not worried about security or so, it's clear this isn't even _meant_
> to have security. But I do wonder about really hard to debug issues if
> userspace suddenly makes syscalls to the host, that'd be ... difficult
> to understand?
I totally understand; I faced similar situations while developing
this patchset.
Originally our patchset had a whitelist-based seccomp filter (w/
SCMP_ACT_ALLOW), but I dropped it from this RFC because I found that
1) it is not a !MMU-specific feature (it can be applied generally to
all UML use cases), and 2) we cannot block a syscall (e.g., ioctl(2))
issued by userspace if that syscall is whitelisted in our seccomp
filter, so the newly introduced filter would not be perfect.
Maintaining the whitelist is also not easy: a syscall used in one
version may be renamed or replaced at some point in the future (what I
faced was that SCMP_SYS(open) had to be changed to SCMP_SYS(openat)).
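
As an illustration of the whitelist approach described above, the following is a minimal sketch that builds a classic-BPF seccomp program from a list of allowed syscall numbers. It deliberately uses the raw `struct sock_filter` interface instead of libseccomp (which the dropped filter used), only to avoid the library dependency. The function name `build_allowlist_filter` and the choice of returning EPERM for everything else are assumptions made for this example, and the architecture check that a real filter needs is omitted.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/syscall.h>
#include <linux/filter.h>
#include <linux/seccomp.h>

/*
 * Hypothetical helper: emit a BPF program that allows the syscalls in
 * allowed[] and fails every other syscall with EPERM.
 * Returns the number of instructions written, or 0 if the list is too
 * long for 8-bit jump offsets or the output buffer is too small.
 * NOTE: a real filter should also check seccomp_data.arch first.
 */
static size_t build_allowlist_filter(const int *allowed, size_t n,
				     struct sock_filter *out, size_t cap)
{
	size_t i, pc = 0;

	assert(allowed != NULL && out != NULL);
	if (n > 255 || cap < n + 3)
		return 0;

	/* Load the syscall number into the accumulator. */
	out[pc++] = (struct sock_filter)BPF_STMT(BPF_LD | BPF_W | BPF_ABS,
			offsetof(struct seccomp_data, nr));

	/* One comparison per entry; a match jumps to the final ALLOW. */
	for (i = 0; i < n; i++)
		out[pc++] = (struct sock_filter)BPF_JUMP(
				BPF_JMP | BPF_JEQ | BPF_K,
				(unsigned int)allowed[i],
				(unsigned char)(n - i), 0);

	/* Default action: fail the syscall with EPERM. */
	out[pc++] = (struct sock_filter)BPF_STMT(BPF_RET | BPF_K,
			SECCOMP_RET_ERRNO | (EPERM & SECCOMP_RET_DATA));
	/* Target of every successful comparison. */
	out[pc++] = (struct sock_filter)BPF_STMT(BPF_RET | BPF_K,
			SECCOMP_RET_ALLOW);
	return pc;
}
```

The resulting array would be wrapped in a `struct sock_fprog` and installed with prctl(PR_SET_SECCOMP, SECCOMP_MODE_FILTER, ...) after setting PR_SET_NO_NEW_PRIVS. The sketch also shows both problems mentioned above: the filter decides only on the syscall number, so a whitelisted call such as ioctl(2) issued by untranslated code still reaches the host, and the list has to be updated by hand whenever the userspace side switches syscalls (for example, listing SYS_openat where SYS_open used to be enough).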
-- Hajime