linux interprets an fcntl int arg as long

Tue Nov 1 03:02:26 PDT 2022

From: Szabolcs Nagy
> Sent: 01 November 2022 09:11
> 
> The 10/31/2022 21:46, Theodore Ts'o wrote:
> > On Mon, Oct 31, 2022 at 12:44:59PM +0000, Szabolcs Nagy wrote:
> > > and such fcntl call can happen with c code that just passes
> > > F_SEAL_WRITE since it is an int and e.g. with aarch64 pcs rules
> > > it is passed in a register where top bits can be non-zero
> > > (unlikely in practice but valid).
> >
> > In Linux's aarch64 ABI, an int is a 4-byte value.  It is *not* an
> > 8-byte value.  So passing in "F_SEAL_WRITE | 0xF00000000" as an int
> > (as in your example) is simply not a valid thing for the userspace
> > program to do.
> >
> > Now, if there is a C program which has "int c = F_SEAL_WRITE", if the
> > PCS allows the compiler to pass a function parameter c --- for example
> > f(a, b, c) --- where the 4-byte parameter 'c' is placed in a 64-bit
> > register where the high bits of the 64-bit register contain non-zero
> > garbage values, I would argue that this is a bug in the PCS and/or the
> > compiler.
> 
> the callee uses va_arg(ap, type) to get the argument,
> and if the type is wider than what was actually passed
> then anything can happen. in practice what happens is
> that the top bits can be non-zero.
> 
> many pcs are affected (aarch64 is the one i know well,
> but at least x86_64, arm are affected too). and even if
> it was aarch64 pcs only, it is incompetent to say that
> the pcs is wrong: that's a constraint we are working with.
> 
> the kernel must not read a wider type than what it
> documents as argument to variadic functions in the c api.
> (it does not make much sense to expect anything there
> anyway, but it can break userspace)

The Linux kernel just assumes that the varargs call looks like
a non-varargs call with the same parameters.
(It doesn't use va_arg().)
All syscall arguments are passed in registers (unlike the BSDs,
where they can also be on the user stack).
On 64-bit systems the same registers are expected to be used
for 64-bit and 32-bit integers and for pointers.
32-bit values usually get masked because they get passed to
a function with an 'int' argument.
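
To illustrate the last point, here is a rough userspace sketch (not kernel
code). reg_t, EXAMPLE_FLAG, handler_int(), entry_narrowing() and
entry_wide() are all made-up names; the only claim is that narrowing to
'int' discards whatever was in the high half of the register.

```c
#include <stdint.h>

/* Hypothetical stand-in for a raw 64-bit argument register. */
typedef uint64_t reg_t;

/* Hypothetical small flag value, used only for illustration. */
#define EXAMPLE_FLAG 0x8

/* Hypothetical handler declared with an 'int' parameter. */
int handler_int(int arg)
{
    return arg;
}

/*
 * Entry point that narrows the raw register before calling the handler.
 * The uint32_t cast is well defined and keeps only the low 32 bits.
 * (Converting a value above INT_MAX to int is implementation-defined
 * before C23; small flag values like the one here are unaffected.)
 */
int entry_narrowing(reg_t raw)
{
    return handler_int((int)(uint32_t)raw);
}

/* Entry point that uses the whole register: high-bit garbage survives. */
uint64_t entry_wide(reg_t raw)
{
    return raw;
}
```

With a register value of 0xF00000000 | EXAMPLE_FLAG, entry_narrowing()
sees just the flag while entry_wide() sees the garbage as well.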

If any fcntl() calls require a 64-bit value and the C ABI
might leave non-zero high bits in a register containing
a 32-bit value (esp. to a varargs function) then the calling
code will need to cast such arguments to 64 bits.
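
A minimal sketch of what that cast looks like on the calling side.
read_wide_arg() is a hypothetical variadic function that fetches its
optional argument as 'unsigned long'; it stands in for any callee that
reads a 64-bit value, and is not the real fcntl().

```c
#include <stdarg.h>

/* Hypothetical variadic callee that reads a 64-bit optional argument. */
unsigned long read_wide_arg(int cmd, ...)
{
    va_list ap;
    unsigned long arg;

    (void)cmd;  /* unused in this sketch */
    va_start(ap, cmd);
    arg = va_arg(ap, unsigned long);
    va_end(ap);
    return arg;
}

/*
 * Caller holding an 'int'. Passing 'flags' directly would mismatch the
 * va_arg type above (undefined behaviour, high bits unspecified).
 * The explicit cast makes the passed type match what the callee reads.
 */
unsigned long call_with_cast(int flags)
{
    return read_wide_arg(0, (unsigned long)flags);
}
```

For non-negative flags the callee then sees exactly the value the caller
intended, whatever the PCS does with unused register bits.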

OTOH I suspect the fcntl() argument is always either absent, an int
or a pointer. So the kernel should mask int arguments to 32 bits.
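
Something along these lines, as a sketch only. The arg_kind classification
and normalise_arg() are invented for illustration and do not correspond to
anything in the kernel source.

```c
#include <stdint.h>

/* Hypothetical classification of what a given command expects. */
enum arg_kind { ARG_NONE, ARG_INT, ARG_PTR };

/*
 * Normalise a raw 64-bit register value according to the argument kind.
 * Masking zero-extends; a command where negative ints matter would
 * want sign extension via an int32_t cast instead.
 */
uint64_t normalise_arg(enum arg_kind kind, uint64_t raw)
{
    switch (kind) {
    case ARG_NONE:
        return 0;                  /* argument is ignored */
    case ARG_INT:
        return raw & 0xffffffffu;  /* drop possible high-bit garbage */
    case ARG_PTR:
        return raw;                /* pointers need all 64 bits */
    }
    return raw;
}
```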

Note that there are ABIs where 'int' and 'pointer' get passed
in different registers.
Fortunately none will support Linux!

	David
