rxrpc kernel sockets hold additional reference to dst

David Howells dhowells at redhat.com
Fri Jan 29 06:29:55 EST 2021


Vadim Fedorenko <vfedorenko at novek.ru> wrote:

> > Alternatively, should rxrpc discard the dst attached to the skb?  Having it
> > available could be useful.  One thing I'd like to do at some point is cache
> > the dst in an rxrpc_connection struct so that I can skip the address lookup
> > when doing sendmsg.
> 
> Discarding sk->sk_rx_dst on every rxrpc_input_packet works in this special
> test but could lead to strange effects in case of concurrent receiving.

I think I must be getting confused as to what the problem actually is.

I was talking about skb->_skb_refdst.  I thought you meant that that was the
ref that was leaking:

	After that in __udp4_lib_rcv skb_steal_sock completes successfully and
	skb->dst is assigned to sk->sk_rx_dst with additional reference taken
	which is never released and leaked after rxrpc socket is destroyed.

I can see, sk->sk_rx_dst is exchanged each time a packet is processed that's
from a different source.  Btw, is there a reason we're doing this on a UDP
socket where there could be packets coming in from all over the place
simultaneously?  Do we serialise processing of incoming packets from multiple
interfaces?  I'm just wondering if udp_v4_early_demux() doing:

	dst = READ_ONCE(sk->sk_rx_dst);

is actually useful.  Could it have been swapped several times since we set it?

Also, shouldn't the UDP socket clean up sk->sk_rx_dst when it closes?

> If you want to cache dst for sendmsg I think you should consider separating
> registration of subsystem and register network (udp socket) part with 
> register_pernet_device() and add releasing dst references in socket
> desctructor. In this case socket destructor will be called together with all
> the devices in namespace and will not lead to dead lock.

Yeah, I haven't really gotten around to looking deeply into this yet.  Why
would it deadlock?  Is dst information released earlier?

I would expect to be releasing cached dst records when cleaning up the
connection cache (rxrpc_destroy_all_connections()).  Currently that's called
from rxrpc_exit_net() and done before the UDP socket is closed so that I can
flush any outstanding deferred final ACKs of calls done on those
connections[*].

[*] An Rx RPC is phased as DATA packets going client->server (the request),
    DATA packets going server->client (the reply) and then the client sending
    an ACK to polish off.  However, the final ACK is implicit if we instead
    send the first DATA packet of the next call on the same connection
    channel.  This saves a packet.  So final ACK transmission is deferred by a
    small interval, just in case we can discard it.

David




More information about the linux-afs mailing list