[RFC PATCH 0/4] net/io_uring: pass a kernel pointer via optlen_t to proto[_ops].getsockopt()
Linus Torvalds
torvalds at linux-foundation.org
Tue Apr 1 17:40:19 PDT 2025
On Mon, 31 Mar 2025 at 13:11, Stefan Metzmacher <metze at samba.org> wrote:
>
> But as Linus don't like 'sockptr_t' I used a different approach.
So the sockptr_t thing has already happened. I hate it, and I think
it's ugly as hell, but it is what it is.
I think it's a complete hack and having that "kernel or user" pointer
flag is disgusting.
Making things worse, the naming is disgusting too, talking about some
random "socket pointer", when it has absolutely nothing to do with
socket, and isn't even a pointer. It's something else.
It's literally called "socket" not because it has anything to do with
sockets, but because it's a socket-specific hack that isn't acceptable
anywhere else in the kernel.
So that "socket" part of the name is literally shorthand for "only
sockets are disgusting enough to use this, and nobody else should ever
touch this crap".
At least so far that part has mostly worked, even if there's some
"sockptr_t" use in the crypto code. I didn't look closer, because I
didn't want to lose my lunch.
I don't understand why the networking code uses that thing.
If you have a "fat pointer", you should damn well make it have the
size of the area too, and do things *right*.
Instead of doing what sockptr_t does, which is a complete hack to just
pass a kernel/user flag, and then passes the length *separately*
because the socket code couldn't be arsed to do the right thing.
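
To illustrate the "fat pointer" idea above, here is a minimal user-space sketch, not kernel code. The names `struct fat_ptr`, `fat_ptr_make()` and `fat_ptr_read()` are hypothetical and appear nowhere in this thread or in the kernel. The only point is that the length travels with the pointer, so the accessor can check bounds itself and no separate length argument is needed.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical illustration only: a pointer that carries its own size. */
struct fat_ptr {
	void   *base;	/* start of the buffer */
	size_t  len;	/* number of valid bytes at base */
};

static struct fat_ptr fat_ptr_make(void *base, size_t len)
{
	/* A NULL base is only allowed for an empty range. */
	assert(base != NULL || len == 0);
	return (struct fat_ptr){ .base = base, .len = len };
}

/* Copy n bytes out of the range; fail instead of reading past the end. */
static int fat_ptr_read(void *dst, struct fat_ptr src, size_t n)
{
	if (n > src.len)
		return -EINVAL;
	memcpy(dst, src.base, n);
	return 0;
}
```

A caller would build the value once with `fat_ptr_make(&v, sizeof(v))` and hand it around as a single argument; a read larger than the recorded length returns `-EINVAL` rather than overrunning the buffer.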
So I do still think "sockptr_t" should die.
As Stanislav says, if you actually want that "user or kernel" thing,
just use an "iov_iter".
No, an "iov_iter" isn't exactly a pretty thing either, but at least
it's the standard way to say "this pointer can have multiple different
kinds of sources".
And it keeps the size of the thing it points to around, so it's at
least a fat pointer with proper ranges, even if it isn't exactly "type
safe" (yes, it's type safe in the sense that it stays as a "iov_iter",
but it's still basically a "random pointer").
> @Linus, would that optlen_t approach fit better for you?
The optlen_t thing is slightly better mainly because it's more
type-safe. At least it's not a "random misnamed
user-or-kernel-pointer" thing where the name is about how nothing else
is so broken as to use it.
So it's better because it's more limited, and it's better in that at
least it has a type-safe pointer rather than a "void *" with no size
or type associated with it.
That said, I don't think it's exactly great.
It's just another case of "networking can't just do it right, and uses
a random hack with special flag values".
So I do think that it would be better to actually get rid of
"sockptr_t optval, unsigned int optlen" ENTIRELY, and replace that
with iov_iter and just make networking bite the bullet and do the
RightThing(tm).
In fact, to make it *really* typesafe, it might be a good idea to wrap
the iov_iter in another struct, something like

    typedef struct sockopt {
            struct iov_iter iter;
    } sockopt_t;

and make the networking functions make the typing very clear, and end
up with an interface something like

    int do_tcp_setsockopt(struct sock *sk,
                          int level, int optname,
                          sockopt_t *val);

where that "sockopt_t *val" replaces not just the "sockptr_t optval",
but also the "unsigned int optlen" thing.
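
As a rough user-space sketch of how a handler might consume such a wrapper: `struct mock_iter` below is a stand-in invented for this example and is much simpler than the kernel's `struct iov_iter`, and `sockopt_read()` and `demo_setsockopt()` are likewise hypothetical names. The sketch only shows that the handler no longer needs a separate `optlen`, because the wrapped iterator tracks how many bytes remain.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Stand-in for struct iov_iter (assumption): a cursor plus bytes remaining. */
struct mock_iter {
	const unsigned char *data;
	size_t count;
};

/* Distinct wrapper type, so option values cannot be confused with other iterators. */
typedef struct sockopt {
	struct mock_iter iter;
} sockopt_t;

/* Consume n bytes from the option value, or fail if fewer remain. */
static int sockopt_read(void *dst, size_t n, sockopt_t *val)
{
	assert(val != NULL);
	if (val->iter.count < n)
		return -EINVAL;
	memcpy(dst, val->iter.data, n);
	val->iter.data += n;
	val->iter.count -= n;
	return 0;
}

/* Hypothetical handler for a boolean option: no optlen parameter at all. */
static int demo_setsockopt(int *flag, sockopt_t *val)
{
	int v;
	int err = sockopt_read(&v, sizeof(v), val);

	if (err)
		return err;
	*flag = !!v;
	return 0;
}
```

A value that is too short for the option is rejected inside `sockopt_read()`, which is the check each handler currently has to do by hand against `optlen`.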
And no, I didn't look at how much churn that would be. Probably a lot.
Maybe more than people are willing to do - even if I think some of it
could be automated with coccinelle or whatever.
Linus