rxrpc kernel sockets hold additional reference to dst

Vadim Fedorenko vfedorenko at novek.ru
Thu Jan 28 19:07:44 EST 2021


On 28.01.2021 21:11, David Howells wrote:
> 
> So afs_open_socket() tries to bind a socket to port AFS_CM_PORT (7001),
> address 0 initially, but if that fails, it will try instead port 0, address 0,
> in an attempt to ask the UDP socket to select a port.
> 
> This is most likely to happen if a new net namespace gets created since
> there's normally one port used per namespace - userspace tools, however, may
> also use an AF_RXRPC socket, though they wouldn't normally bind it (though a
> server would).

Yes, that excatly happens in a new namespace.

>> I'm not sure how to deal with early demux in this case. I think we should add
>> more restrictions to  __udp4_lib_demux_lookup but I'm not sure which exactly.
>> Do you have any suggestions?
> 
> Is there a way to allocate an unused port from UDP?

I'm not sure that it's possible in terms of binding to listen, no matter UDP or 
TCP. I think server side should always know the port it wants to listen.

> Alternatively, should rxrpc discard the dst attached to the skb?  Having it
> available could be useful.  One thing I'd like to do at some point is cache
> the dst in an rxrpc_connection struct so that I can skip the address lookup
> when doing sendmsg.

Discarding sk->sk_rx_dst on every rxrpc_input_packet works in this special test 
but could lead to strange effects in case of concurrent receiving.

If you want to cache dst for sendmsg I think you should consider separating 
registration of subsystem and register network (udp socket) part with 
register_pernet_device() and add releasing dst references in socket desctructor. 
In this case socket destructor will be called together with all the devices in 
namespace and will not lead to dead lock.

For now I think the easiest way is to skip sockets with encap_type set to 
UDP_ENCAP_RXRPC in early_demux.

diff --git a/net/ipv4/udp.c b/net/ipv4/udp.c
index 9eeebd4..e8f7e28 100644
--- a/net/ipv4/udp.c
+++ b/net/ipv4/udp.c
@@ -2484,8 +2484,12 @@ static struct sock *__udp4_lib_demux_lookup(struct net *net,

         udp_portaddr_for_each_entry_rcu(sk, &hslot2->head) {
                 if (INET_MATCH(sk, net, acookie, rmt_addr,
-                              loc_addr, ports, dif, sdif))
+                              loc_addr, ports, dif, sdif)) {
+                       if (unlikely(udp_sk(sk)->encap_type == UDP_ENCAP_RXRPC))
+                               continue;
+
                         return sk;
+               }
                 /* Only check first socket in chain */
                 break;
         }
diff --git a/net/ipv6/udp.c b/net/ipv6/udp.c
index 29d9691..1a88519 100644
--- a/net/ipv6/udp.c
+++ b/net/ipv6/udp.c
@@ -1018,8 +1018,12 @@ static struct sock *__udp6_lib_demux_lookup(struct net *net,

         udp_portaddr_for_each_entry_rcu(sk, &hslot2->head) {
                 if (sk->sk_state == TCP_ESTABLISHED &&
-                   INET6_MATCH(sk, net, rmt_addr, loc_addr, ports, dif, sdif))
+                   INET6_MATCH(sk, net, rmt_addr, loc_addr, ports, dif, sdif)) {
+                       if (unlikely(udp_sk(usk)->encap_type == UDP_ENCAP_RXRPC))
+                               continue;
+
                         return sk;
+               }
                 /* Only check first socket in chain */
                 break;
         }


Vadim




More information about the linux-afs mailing list