[PATCH v6 6/9] seccomp: add "seccomp" syscall
luto at amacapital.net
Fri Jun 13 14:42:03 PDT 2014
On Fri, Jun 13, 2014 at 2:37 PM, Alexei Starovoitov <ast at plumgrid.com> wrote:
> On Fri, Jun 13, 2014 at 2:25 PM, Andy Lutomirski <luto at amacapital.net> wrote:
>> On Fri, Jun 13, 2014 at 2:22 PM, Alexei Starovoitov <ast at plumgrid.com> wrote:
>>> On Tue, Jun 10, 2014 at 8:25 PM, Kees Cook <keescook at chromium.org> wrote:
>>>> This adds the new "seccomp" syscall with both an "operation" and "flags"
>>>> parameter for future expansion. The third argument is a pointer value,
>>>> used with the SECCOMP_SET_MODE_FILTER operation. Currently, flags must
>>>> be 0. This is functionally equivalent to prctl(PR_SET_SECCOMP, ...).
>>>> Signed-off-by: Kees Cook <keescook at chromium.org>
>>>> Cc: linux-api at vger.kernel.org
>>>> arch/x86/syscalls/syscall_32.tbl | 1 +
>>>> arch/x86/syscalls/syscall_64.tbl | 1 +
>>>> include/linux/syscalls.h | 2 ++
>>>> include/uapi/asm-generic/unistd.h | 4 ++-
>>>> include/uapi/linux/seccomp.h | 4 +++
>>>> kernel/seccomp.c | 63 ++++++++++++++++++++++++++++++++-----
>>>> kernel/sys_ni.c | 3 ++
>>>> 7 files changed, 69 insertions(+), 9 deletions(-)
>>>> diff --git a/arch/x86/syscalls/syscall_32.tbl b/arch/x86/syscalls/syscall_32.tbl
>>>> index d6b867921612..7527eac24122 100644
>>>> --- a/arch/x86/syscalls/syscall_32.tbl
>>>> +++ b/arch/x86/syscalls/syscall_32.tbl
>>>> @@ -360,3 +360,4 @@
>>>> 351 i386 sched_setattr sys_sched_setattr
>>>> 352 i386 sched_getattr sys_sched_getattr
>>>> 353 i386 renameat2 sys_renameat2
>>>> +354 i386 seccomp sys_seccomp
>>>> diff --git a/arch/x86/syscalls/syscall_64.tbl b/arch/x86/syscalls/syscall_64.tbl
>>>> index ec255a1646d2..16272a6c12b7 100644
>>>> --- a/arch/x86/syscalls/syscall_64.tbl
>>>> +++ b/arch/x86/syscalls/syscall_64.tbl
>>>> @@ -323,6 +323,7 @@
>>>> 314 common sched_setattr sys_sched_setattr
>>>> 315 common sched_getattr sys_sched_getattr
>>>> 316 common renameat2 sys_renameat2
>>>> +317 common seccomp sys_seccomp
>>>> # x32-specific system call numbers start at 512 to avoid cache impact
>>>> diff --git a/include/linux/syscalls.h b/include/linux/syscalls.h
>>>> index b0881a0ed322..1713977ee26f 100644
>>>> --- a/include/linux/syscalls.h
>>>> +++ b/include/linux/syscalls.h
>>>> @@ -866,4 +866,6 @@ asmlinkage long sys_process_vm_writev(pid_t pid,
>>>> asmlinkage long sys_kcmp(pid_t pid1, pid_t pid2, int type,
>>>> unsigned long idx1, unsigned long idx2);
>>>> asmlinkage long sys_finit_module(int fd, const char __user *uargs, int flags);
>>>> +asmlinkage long sys_seccomp(unsigned int op, unsigned int flags,
>>>> + const char __user *uargs);
>>> It looks odd to add 'flags' argument to syscall that is not even used.
>>> I don't think it will be extensible this way.
>>> 'uargs' is used only in 2nd command as well and it's not 'char __user *'
>>> but rather 'struct sock_fprog __user *'
>>> I think it makes more sense to define only first argument as 'int op' and the
>>> rest as variable length array.
>>> Something like:
>>> long sys_seccomp(unsigned int op, struct nlattr *attrs, int len);
>>> then different commands can interpret 'attrs' differently.
>>> if op == mode_strict, then attrs == NULL, len == 0
>>> if op == mode_filter, then attrs->nla_type == seccomp_bpf_filter
>>> and nla_data(attrs) is 'struct sock_fprog'
>> Eww. If the operation doesn't imply the type, then I think we've
>> totally screwed up.
>>> If we decide to add new types of filters or new commands, the syscall prototype
>>> won't need to change. New commands can be added preserving backward
>>> compatibility.
>>> The basic TLV concept has been around forever in the netlink world. imo it
>>> makes sense to use it with new syscalls. Passing 'struct xxx' into syscalls
>>> is a thing of the past. TLV style is more extensible. Fields of structures
>>> can become optional in the future, new fields added, etc.
>>> 'struct nlattr' brings the same benefits to kernel api as protobuf did
>>> to user land.
>> I see no reason to bring nl_attr into this.
>> Admittedly, I've never dealt with nl_attr, but everything
>> netlink-related I've ever been involved in has involved some sort of
>> API atrocity.
> netlink has a lot of legacy and there is genetlink which is not pretty
> either because of extra socket creation, binding, dealing with packet
> loss issues, but the key concept of variable length encoding is sound.
> Right now seccomp has two commands and they already don't fit
> into single syscall neatly. Are you saying there should be two syscalls
> here? What about another seccomp related command? Another syscall?
> imo all seccomp related commands need to be mux/demux-ed under
> one syscall. What is the way to mux/demux potentially very different
> commands under one syscall? I cannot think of anything better than
> TLV style. 'struct nlattr' is what we have today and I think it works fine.
> I'm not suggesting to bring the whole netlink into the picture, but rather
> TLV style of encoding different arguments for different commands.
I'm unconvinced. These are simple commands, and I think the interface
should be simple. Syscalls are cheap.
As an example, the interface could be:
int seccomp_add_filter(const struct sock_fprog *filter, unsigned int flags);
The "tsync" operation would be seccomp_add_filter(NULL,
SECCOMP_ADD_FILTER_TSYNC) -- it's equivalent to adding an
always-accept filter and syncing threads.
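To make the shape of that interface concrete, here is a rough userspace
model of it. This is only a sketch: sketch_add_filter, struct sketch_fprog,
struct sketch_state and the value of SKETCH_ADD_FILTER_TSYNC are all
hypothetical names made up for illustration, nothing here talks to the
kernel, and rejecting a NULL filter without the tsync flag is an assumption
rather than something settled in this thread.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical flag value; the thread names the flag but not its value. */
#define SKETCH_ADD_FILTER_TSYNC 0x1u

/* Stand-in for struct sock_fprog: instruction count plus instruction array. */
struct sketch_fprog {
	unsigned short len;
	const void *insns;
};

/* Toy model of per-task seccomp state, for illustration only. */
struct sketch_state {
	int nfilters;	/* filters attached so far */
	int synced;	/* nonzero once a thread sync was requested */
};

/* Returns 0 on success or a negative errno-style value. */
int sketch_add_filter(struct sketch_state *st,
		      const struct sketch_fprog *filter, unsigned int flags)
{
	if (flags & ~SKETCH_ADD_FILTER_TSYNC)
		return -EINVAL;		/* unknown flag bits */
	if (filter == NULL && !(flags & SKETCH_ADD_FILTER_TSYNC))
		return -EINVAL;		/* assumption: nothing to do is an error */
	if (filter != NULL) {
		if (filter->len == 0 || filter->insns == NULL)
			return -EINVAL;
		st->nfilters++;
	}
	/* A NULL filter attaches nothing, which behaves like always-accept. */
	if (flags & SKETCH_ADD_FILTER_TSYNC)
		st->synced = 1;
	return 0;
}

/* Usage example: one filter, then the "tsync" call with a NULL filter. */
void sketch_add_filter_demo(void)
{
	static const unsigned char insns[8] = { 0 };
	struct sketch_fprog prog = { 1, insns };
	struct sketch_state st = { 0, 0 };

	assert(sketch_add_filter(&st, &prog, 0) == 0);
	assert(sketch_add_filter(&st, NULL, SKETCH_ADD_FILTER_TSYNC) == 0);
	assert(st.nfilters == 1 && st.synced == 1);
	assert(sketch_add_filter(&st, NULL, 0) == -EINVAL);
}
```

The argument type is fixed by the prototype, so there is nothing to decode
before the call can be validated.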
But, frankly, this kind of stuff should probably be "do operation X".
IIUC nl_attr is more like "do something, with these tags and values",
which results in oddities like whatever should happen if more than one
tag is set.
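A small sketch of the kind of oddity I mean, as far as I understand the
encoding. struct sketch_attr below is a hypothetical header modeled on
struct nlattr (16-bit length that includes the header, 16-bit type, payload
padded to 4 bytes); the helper names and the type number 1 are made up for
illustration and this is not the kernel's parser.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical TLV header modeled on struct nlattr; not the kernel type. */
struct sketch_attr {
	uint16_t len;	/* header plus payload, not counting padding */
	uint16_t type;
};

/* Round n up to the next multiple of 4, as netlink attributes are padded. */
static size_t sketch_align(size_t n)
{
	return (n + 3) & ~(size_t)3;
}

/*
 * Append one attribute at offset off and return the next aligned offset.
 * The caller must make sure buf has room for the padded attribute.
 */
size_t sketch_put(unsigned char *buf, size_t off, uint16_t type,
		  const void *data, uint16_t datalen)
{
	struct sketch_attr hdr = { (uint16_t)(sizeof(hdr) + datalen), type };
	size_t end = off + sizeof(hdr) + datalen;
	size_t next = sketch_align(end);

	memcpy(buf + off, &hdr, sizeof(hdr));
	memcpy(buf + off + sizeof(hdr), data, datalen);
	memset(buf + end, 0, next - end);	/* zero the padding */
	return next;
}

/* Count attributes of the given type; returns -1 if the buffer is malformed. */
int sketch_count_type(const unsigned char *buf, size_t buflen, uint16_t type)
{
	size_t off = 0;
	int count = 0;

	while (off < buflen) {
		struct sketch_attr hdr;
		size_t left = buflen - off;
		size_t step;

		if (left < sizeof(hdr))
			return -1;
		memcpy(&hdr, buf + off, sizeof(hdr));
		if (hdr.len < sizeof(hdr) || hdr.len > left)
			return -1;
		if (hdr.type == type)
			count++;
		step = sketch_align(hdr.len);
		off += step < left ? step : left;	/* last attribute may be unpadded */
	}
	return count;
}

/* Usage example: a well-formed buffer that carries type 1 twice. */
void sketch_tlv_demo(void)
{
	unsigned char buf[32];
	uint32_t a = 1, b = 2;
	size_t n = 0;

	n = sketch_put(buf, n, 1, &a, sizeof(a));
	n = sketch_put(buf, n, 1, &b, sizeof(b));
	assert(n == 16);
	assert(sketch_count_type(buf, n, 1) == 2);
}
```

The buffer parses cleanly, yet the receiver still has to pick a policy for
the duplicate (reject it, take the first, or take the last), and that policy
becomes part of the ABI.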
AMA Capital Management, LLC