[PATCH 2/2] socket: add fallback for nl_connect() by trying to bind to unspecified local port

Thomas Haller thaller at redhat.com
Fri Jul 10 05:58:51 PDT 2015


libnl allows the user to explicitly set the local port before connecting
the socket. A more convenient way is to leave the local port unspecified
and let libnl generate a port id.

As it is, generate_local_port() would try at most 1024 ports, that
means if a user tries to connect more sockets, the automatism will
fail.

Kernel also supports choosing the local port itself (via netlink_autobind()).
So, this could be fixed by always leaving the port unspecified and let
kernel decide on the port. For that we could entirely drop generate_local_port().

There are however problems with that:

  - it is unclear why generate_local_port() was even introduced in the
    first place instead of always relying to kernel.
    This code already appeared in libnl-1, so maybe there was a good
    reason for it or it is necessary on some kernel versions.

  - The deprecated libnl-1 library also uses a form of generate_local_port().
    It's first guess would always be getpid(), but the problem is that
    it would not retry on EADDRINUSE. Currently libnl-3 generates ports in
    a different sequence and will not generate a conflicting port (until it
    already exhausted 1016 other ports).
    Hence, currently if your application uses libnl1 and libnl3
    together, the automatism might just work without conflicts.
    Accidently, kernel/netlink_autobind() also first tries the process
    id.
    That means, if we change libnl-3 to leave the decision to
    kernel, and
      - the application connects sockets both via libnl-1 and libnl-3
      - and the libnl-3 socket happens to connect first
    then the libnl-1 socket would fail to connect.

  - Removing generate_local_port() entirely changes behavior in the
    following case:

        sk = nl_socket_alloc();
        /* accessing local port before connecting the socket used to
         * freeze the local port to the generated value. */
        port = nl_socket_get_local_port(sk);
        nl_connect(sk, NETLINK_...);

Maybe the issues are minor and it would simplify the code just to get
rid of the cruft. But instead fix the issue without changing behavior.
Just keep trying with generate_local_port() first, before fallback to
kernel.

Reported-by: Julien Courtat <julien.courtat at 6wind.com>
Signed-off-by: Thomas Haller <thaller at redhat.com>

http://lists.infradead.org/pipermail/libnl/2015-June/001889.html
---
 lib/nl.c     | 29 ++++++++++++++++++-----------
 lib/socket.c |  3 ---
 2 files changed, 18 insertions(+), 14 deletions(-)

diff --git a/lib/nl.c b/lib/nl.c
index 8c6d02f..e7ccfae 100644
--- a/lib/nl.c
+++ b/lib/nl.c
@@ -105,6 +105,7 @@ int nl_connect(struct nl_sock *sk, int protocol)
 	socklen_t addrlen;
 	struct sockaddr_nl local = { 0 };
 	char buf[64];
+	int try_bind = 1;
 
 #ifdef SOCK_CLOEXEC
 	flags |= SOCK_CLOEXEC;
@@ -129,20 +130,26 @@ int nl_connect(struct nl_sock *sk, int protocol)
 	if (_nl_socket_is_local_port_unspecified (sk)) {
 		uint32_t port;
 		uint32_t used_ports[32] = { 0 };
+		int ntries = 0;
 
 		while (1) {
+			if (ntries++ > 5) {
+				/* try only a few times. We hit this only if many ports are already in
+				 * use but allocated *outside* libnl/generate_local_port(). */
+				nl_socket_set_local_port (sk, 0);
+				break;
+			}
+
 			port = _nl_socket_generate_local_port_no_release(sk);
+			if (port == 0)
+				break;
 
-			if (port == 0) {
-				NL_DBG(4, "nl_connect(%p): no more unused local ports.\n", sk);
-				_nl_socket_used_ports_release_all(used_ports);
-				err = -NLE_EXIST;
-				goto errout;
-			}
 			err = bind(sk->s_fd, (struct sockaddr*) &sk->s_local,
 				   sizeof(sk->s_local));
-			if (err == 0)
+			if (err == 0) {
+				try_bind = 0;
 				break;
+			}
 
 			errsv = errno;
 			if (errsv == EADDRINUSE) {
@@ -157,7 +164,8 @@ int nl_connect(struct nl_sock *sk, int protocol)
 			}
 		}
 		_nl_socket_used_ports_release_all(used_ports);
-	} else {
+	}
+	if (try_bind) {
 		err = bind(sk->s_fd, (struct sockaddr*) &sk->s_local,
 			   sizeof(sk->s_local));
 		if (err != 0) {
@@ -190,9 +198,8 @@ int nl_connect(struct nl_sock *sk, int protocol)
 	}
 
 	if (sk->s_local.nl_pid != local.nl_pid) {
-		/* strange, the port id is not as expected. Set the local
-		 * port id to release a possibly generated port and un-own
-		 * it. */
+		/* The port id is different. That can happen if the port id was zero
+		 * and kernel assigned a local port. */
 		nl_socket_set_local_port (sk, local.nl_pid);
 	}
 	sk->s_local = local;
diff --git a/lib/socket.c b/lib/socket.c
index 32be42e..d0fabe4 100644
--- a/lib/socket.c
+++ b/lib/socket.c
@@ -114,9 +114,6 @@ static uint32_t generate_local_port(void)
 	}
 
 	nl_write_unlock(&port_map_lock);
-
-	/* Out of sockets in our own PID namespace, what to do? FIXME */
-	NL_DBG(1, "Warning: Ran out of unique local port namespace\n");
 	return 0;
 }
 
-- 
2.4.3




More information about the libnl mailing list