[PATCH v5 3/5] x86: Split syscall_trace_enter into two phases

Fri Feb 6 12:12:41 PST 2015

On Fri, Feb 6, 2015 at 12:07 PM, Kees Cook <keescook at chromium.org> wrote:
> On Fri, Feb 6, 2015 at 11:32 AM, Andy Lutomirski <luto at amacapital.net> wrote:
>> On Fri, Feb 6, 2015 at 11:23 AM, Kees Cook <keescook at chromium.org> wrote:
>>> And especially since a ptracer
>>> can change syscalls during syscall-enter-stop to any syscall it wants,
>>> bypassing seccomp. This condition is already documented.
>>
>> If a ptracer (using PTRACE_SYSCALL) were to get the entry callback
>> before seccomp, then this oddity would go away, which might be a good
>> thing.  A ptracer could change the syscall, but seccomp would then act
>> on whatever the ptracer changed the syscall to.
>
> I want kill events to trigger immediately. I don't want to leave the
> ptrace surface available on a SECCOMP_RET_KILL. So maybe it can be
> seccomp phase 1, then ptrace, then seccomp phase 2? And pass more
> information between phases to determine how things should behave
> beyond just "skip"?

I thought so too, originally, but I'm far less convinced now, for two reasons:

1. I think that a lot of filters these days use RET_ERRNO heavily, so
handling only RET_KILL ahead of ptrace won't benefit them.

2. I'm not convinced it really reduces the attack surface for anyone.
Unless your filter is literally "return SECCOMP_RET_KILL", the
seccomp-filtered task can always cause the ptracer to get a pair of
syscall notifications.  Also, the task can send itself signals (using
page faults, breakpoints, etc.) and cause ptrace events via other
paths.
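
To make the two points concrete, here is a toy userspace model of the
"seccomp phase 1, then ptrace, then seccomp phase 2" ordering.  It is
only a sketch, not kernel code: enter_split(), demo_filter(),
demo_tracer(), the verdict/outcome enums and the syscall numbers 0..3
are all made up for illustration, and having phase 2 simply re-run the
filter on the possibly rewritten syscall number is an assumption (the
proposal above leaves open what gets passed between phases).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical verdicts, loosely named after the SECCOMP_RET_* actions. */
enum verdict { V_ALLOW, V_ERRNO, V_KILL };
enum outcome { O_RUN, O_ERRNO, O_KILLED };

struct result {
	enum outcome outcome;
	bool tracer_notified;	/* did the tracer get a syscall-enter-stop? */
	int nr;			/* syscall number after any tracer rewrite */
};

typedef enum verdict (*filter_fn)(int nr);
typedef int (*tracer_fn)(int nr);	/* returns the (maybe rewritten) nr */

/* Toy model of: seccomp phase 1 -> ptrace -> seccomp phase 2. */
struct result enter_split(int nr, filter_fn filter, tracer_fn tracer)
{
	struct result r = { O_RUN, false, nr };

	assert(filter != NULL);

	/* Phase 1: only a kill verdict short-circuits before the tracer. */
	if (filter(nr) == V_KILL) {
		r.outcome = O_KILLED;
		return r;
	}

	/* ptrace: the tracer is notified and may change the syscall. */
	if (tracer != NULL) {
		r.tracer_notified = true;
		nr = tracer(nr);
	}
	r.nr = nr;

	/* Phase 2 (assumed): judge whatever the tracer left behind. */
	switch (filter(nr)) {
	case V_KILL:
		r.outcome = O_KILLED;
		break;
	case V_ERRNO:
		r.outcome = O_ERRNO;
		break;
	default:
		r.outcome = O_RUN;
		break;
	}
	return r;
}

/* Made-up policy: 1 fails with an errno, 2 kills, everything else runs. */
enum verdict demo_filter(int nr)
{
	if (nr == 1)
		return V_ERRNO;
	if (nr == 2)
		return V_KILL;
	return V_ALLOW;
}

/* Made-up tracer: rewrites syscall 0 into syscall 2, leaves the rest. */
int demo_tracer(int nr)
{
	return nr == 0 ? 2 : nr;
}
```

In this model, enter_split(2, demo_filter, demo_tracer) is killed
without the tracer ever being notified, which is the property Kees
wants.  But enter_split(1, ...) still notifies the tracer before
failing with an errno (point 1), and any allowed syscall such as 3
reaches the tracer as well (point 2).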

--Andy