rxrpc kernel sockets hold additional reference to dst
Vadim Fedorenko
vfedorenko at novek.ru
Wed Jan 27 22:47:01 EST 2021
Hi!
I found the root cause of an old syzkaller bug:
https://syzkaller.appspot.com/bug?id=949ecf93b67ab1df8f890571d24ef9db50872c96
RXRPC sockets are built on top of UDP sockets. Because of that, __udp4_lib_rcv
sets sk->sk_rx_dst and takes a reference for such sockets, but
rxrpc_sock_destructor never releases this reference. Simply adding
dst_release(sk->sk_rx_dst) to rxrpc_sock_destructor does not help when the
namespace of the rxrpc socket is being destroyed. This is because ops_free is
ordered so that netdevices are destroyed before kernel sockets. The result is a
deadlock: the rxrpc socket holds a reference to a dst_entry, which in turn holds
a reference to the device in the namespace. So ops_free cannot destroy all the
netdevices in the namespace, while the rxrpc socket waits for the next ops_free
operation, which is executed only after the netdevices have been destroyed.
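
To make the reference chain easier to follow, here is a minimal userspace
sketch of it. This is not kernel code: every toy_* type and function is a
hypothetical stand-in invented for illustration, and the real refcounting in
dst_entry and net_device is considerably more involved. It only shows that a
device cannot be torn down while a socket still pins a dst that pins the device.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model only: these are NOT the kernel's structures. */
struct toy_netdev { int refcnt; bool unregistered; };
struct toy_dst    { int refcnt; struct toy_netdev *dev; };
struct toy_sock   { struct toy_dst *rx_dst; };

/* A dst pins the device it routes through. */
static void toy_dst_init(struct toy_dst *dst, struct toy_netdev *dev)
{
	dst->refcnt = 0;
	dst->dev = dev;
	dev->refcnt++;
}

/* Models the receive path caching a dst on the socket and taking a ref. */
static void toy_sock_cache_dst(struct toy_sock *sk, struct toy_dst *dst)
{
	sk->rx_dst = dst;
	dst->refcnt++;
}

/* Dropping the last dst reference also drops the device reference. */
static void toy_dst_release(struct toy_dst *dst)
{
	if (!dst)
		return;
	assert(dst->refcnt > 0);
	if (--dst->refcnt == 0) {
		dst->dev->refcnt--;
		dst->dev = NULL;
	}
}

/* Models the socket destructor with the dst release added. */
static void toy_sock_destroy(struct toy_sock *sk)
{
	toy_dst_release(sk->rx_dst);
	sk->rx_dst = NULL;
}

/* The device can only go away once nothing references it. */
static bool toy_netdev_try_unregister(struct toy_netdev *dev)
{
	if (dev->refcnt > 0)
		return false;
	dev->unregistered = true;
	return true;
}
```

In this model, calling toy_netdev_try_unregister() before toy_sock_destroy()
fails, and it succeeds only after the socket has dropped its dst. That is the
same ordering problem as above: the step that tears down the devices runs
before the step that would release the socket.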
My attempt to fix this by changing the rxrpc exit operation to pre-exit does
not work well, so I need advice on how to deal with this deadlock.
diff --git a/net/rxrpc/af_rxrpc.c b/net/rxrpc/af_rxrpc.c
index 0a2f481..8f50238 100644
--- a/net/rxrpc/af_rxrpc.c
+++ b/net/rxrpc/af_rxrpc.c
@@ -833,10 +842,16 @@ static void rxrpc_sock_destructor(struct sock *sk)
_enter("%p", sk);
rxrpc_purge_queue(&sk->sk_receive_queue);
+ dst_release(sk->sk_rx_dst);
WARN_ON(refcount_read(&sk->sk_wmem_alloc));
WARN_ON(!sk_unhashed(sk));
diff --git a/net/rxrpc/net_ns.c b/net/rxrpc/net_ns.c
index 25bbc4c..9284d82 100644
--- a/net/rxrpc/net_ns.c
+++ b/net/rxrpc/net_ns.c
@@ -108,10 +108,11 @@ static __net_init int rxrpc_init_net(struct net *net)
/*
* Clean up a per-network namespace record.
*/
-static __net_exit void rxrpc_exit_net(struct net *net)
+static __net_exit void rxrpc_pre_exit_net(struct net *net)
{
struct rxrpc_net *rxnet = rxrpc_net(net);
@@ -124,7 +125,7 @@ static __net_exit void rxrpc_exit_net(struct net *net)
struct pernet_operations rxrpc_net_ops = {
.init = rxrpc_init_net,
- .exit = rxrpc_exit_net,
+ .pre_exit = rxrpc_pre_exit_net,
.id = &rxrpc_net_id,
.size = sizeof(struct rxrpc_net),
};