[PATCH 01/23] all: syscall wrappers: add documentation

Heiko Carstens heiko.carstens at de.ibm.com
Thu May 26 23:03:57 PDT 2016


> > > The cost is pretty trivial though. See kernel/compat_wrapper.o:
> > > COMPAT_SYSCALL_WRAP2(creat, const char __user *, pathname, umode_t, mode);
> > > 0:   a9bf7bfd        stp     x29, x30, [sp,#-16]!
> > > 4:   910003fd        mov     x29, sp
> > > 8:   2a0003e0        mov     w0, w0
> > > c:   94000000        bl      0 <sys_creat>
> > > 10:  a8c17bfd        ldp     x29, x30, [sp],#16
> > > 14:  d65f03c0        ret
> > 
> > I would say the above could be more expensive than 8 movs (16 bytes to
> > write, read, a branch and a ret). You can also add the I-cache locality,
> > having wrappers for each syscall instead of a single place for zeroing
> > the upper half (where no other wrapper is necessary).
> > 
> > Can we trick the compiler into doing a tail call optimisation? This
> > could have simply been:
> > 
> > COMPAT_SYSCALL_WRAP2(creat, ...):
> > 	mov	w0, w0
> > 	b	<sys_creat>
> 
> What you are describing was in my initial version, but Heiko insisted on
> having all wrappers together.
> http://www.spinics.net/lists/linux-s390/msg11593.html
> 
> Grep your email for discussion.

I think Catalin's question was more about why there is even a stack frame
generated. It looks like it is not necessary. I asked the same thing a couple
of months ago, when we discussed this.
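
For reference, a minimal sketch in plain C of what such a wrapper amounts to:
clear the upper half of the pointer argument, then hand over to the native
handler. The demo_* names are made up for illustration and are not kernel
symbols, and the "native handler" simply returns the pointer value it saw so
the effect can be checked. Whether a compiler turns the call into a plain
branch without a stack frame depends on the compiler and its options.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the native 64-bit handler. It returns the
 * pointer value it received so the caller can inspect it. */
uint64_t demo_native_creat(uint64_t pathname, unsigned short mode)
{
	(void)mode;
	return pathname;
}

/* Hypothetical compat wrapper. x0 and x1 model the raw 64-bit registers;
 * a 32-bit task may leave garbage in the upper half of x0. The cast to
 * uint32_t is the C equivalent of "mov w0, w0". */
uint64_t demo_compat_creat(uint64_t x0, uint64_t x1)
{
	return demo_native_creat((uint32_t)x0, (unsigned short)x1);
}

/* Small usage example: the upper 32 bits never reach the handler. */
void demo_compat_creat_check(void)
{
	assert(demo_compat_creat(0xdeadbeef00001000ULL, 0644) == 0x1000ULL);
}
```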

> > > > Cost wise, this seems like it all cancels out in the end, but what
> > > > do I know?
> > > 
> > > I think you know something, and I also think Heiko and other s390 guys
> > > know something as well. So I'd like to listen to their arguments here.

When it comes to 64 bit arguments for compat system calls: s390 also has an
x32-like ABI extension which allows user space to use full 64 bit
registers. As far as I know hardly anybody ever made use of that.

However, even if that were widely used, to me it wouldn't make sense to
add new compat system calls which allow 64 bit arguments, simply because
something like

c = (u32)a | (u64)b << 32;

can be done with a single 1-cycle instruction. It's just not worth the
extra effort to maintain additional system call variants.
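
Spelled out as a self-contained sketch in plain C, with a and b modelled as
full 64 bit registers whose low halves carry the two 32 bit pieces. The
function name demo_merge_halves is hypothetical and uses stdint types instead
of the kernel's u32/u64:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: rebuild one 64 bit value from two registers that
 * each carry a 32 bit half. Any garbage in the upper half of a is dropped
 * by the cast; the low half of b becomes the upper half of the result. */
uint64_t demo_merge_halves(uint64_t a, uint64_t b)
{
	return (uint32_t)a | (uint64_t)b << 32;
}

/* Small usage example. */
void demo_merge_halves_check(void)
{
	assert(demo_merge_halves(0xffffffff89abcdefULL, 0x01234567ULL)
	       == 0x0123456789abcdefULL);
}
```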




More information about the linux-arm-kernel mailing list