ipv4: fix race in concurrent ip_route_input_slow()

Linux-MTD Mailing List linux-mtd at lists.infradead.org
Fri Nov 22 17:59:05 EST 2013


Gitweb:     http://git.infradead.org/?p=mtd-2.6.git;a=commit;h=dcdfdf56b4a6c9437fc37dbc9cee94a788f9b0c4
Commit:     dcdfdf56b4a6c9437fc37dbc9cee94a788f9b0c4
Parent:     4f837c3b117c4fdae72d901034d0565d16af7966
Author:     Alexei Starovoitov <ast at plumgrid.com>
AuthorDate: Tue Nov 19 19:12:34 2013 -0800
Committer:  David S. Miller <davem at davemloft.net>
CommitDate: Wed Nov 20 15:28:44 2013 -0500

    ipv4: fix race in concurrent ip_route_input_slow()
    
    CPUs can ask for a local route via ip_route_input_noref() concurrently.
    If nh_rth_input is not cached yet, the CPUs will proceed to allocate
    equivalent DSTs on 'lo' and will then try to cache them in nh_rth_input
    via rt_cache_route().
    Most of the time they succeed, but on occasion the following two lines:
    	orig = *p;
    	prev = cmpxchg(p, orig, rt);
    in rt_cache_route() race, and one of the CPUs fails to complete the cmpxchg.
    But ip_route_input_slow() doesn't check the return code of rt_cache_route(),
    so the dst leaks: dst_destroy() is never called and the 'lo' device
    refcnt doesn't go to zero, which can be seen in the logs as:
    	unregister_netdevice: waiting for lo to become free. Usage count = 1
    Adding mdelay() between the above two lines makes it easily reproducible.
    Fix it similarly to the nh_pcpu_rth_output case.
    
    Fixes: d2d68ba9fe8b ("ipv4: Cache input routes in fib_info nexthops.")
    Signed-off-by: Alexei Starovoitov <ast at plumgrid.com>
    Signed-off-by: David S. Miller <davem at davemloft.net>
---
 net/ipv4/route.c | 8 ++++++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/net/ipv4/route.c b/net/ipv4/route.c
index f428935..f8da282 100644
--- a/net/ipv4/route.c
+++ b/net/ipv4/route.c
@@ -1776,8 +1776,12 @@ local_input:
 		rth->dst.error= -err;
 		rth->rt_flags 	&= ~RTCF_LOCAL;
 	}
-	if (do_cache)
-		rt_cache_route(&FIB_RES_NH(res), rth);
+	if (do_cache) {
+		if (unlikely(!rt_cache_route(&FIB_RES_NH(res), rth))) {
+			rth->dst.flags |= DST_NOCACHE;
+			rt_add_uncached_list(rth);
+		}
+	}
 	skb_dst_set(skb, &rth->dst);
 	err = 0;
 	goto out;
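
The race and the fix described in the commit message can be illustrated with a small user-space sketch. This is not kernel code: `struct route`, `try_cache()`, `add_uncached()` and `simulate_race()` are hypothetical stand-ins for the dst entry, the read-then-cmpxchg step in rt_cache_route(), rt_add_uncached_list() and the two racing CPUs. The interleaving is replayed deterministically in one thread, assuming both CPUs read the empty slot before either one writes to it.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a cached input route (dst entry). */
struct route {
	bool nocache;                /* stand-in for DST_NOCACHE */
	struct route *next_uncached; /* link on the uncached list */
};

static _Atomic(struct route *) slot;  /* stand-in for nh_rth_input */
static struct route *uncached_list;   /* stand-in for the uncached list */
static struct route rt_a, rt_b;       /* routes allocated by CPU A and B */

/* Publish rt in the slot only if the slot still holds the value 'seen'.
 * Returns false when another CPU changed the slot in between. */
static bool try_cache(struct route *seen, struct route *rt)
{
	return atomic_compare_exchange_strong(&slot, &seen, rt);
}

/* Keep track of a route that could not be cached so it can be freed later. */
static void add_uncached(struct route *rt)
{
	assert(rt != NULL);
	rt->nocache = true;
	rt->next_uncached = uncached_list;
	uncached_list = rt;
}

/* Replay the racy interleaving and return how many of the two routes are
 * still tracked afterwards (cached in the slot or on the uncached list). */
static int simulate_race(bool handle_failure)
{
	struct route *seen_a, *seen_b;
	int tracked = 0;

	atomic_store(&slot, (struct route *)NULL);
	uncached_list = NULL;
	rt_a = (struct route){0};
	rt_b = (struct route){0};

	/* Both CPUs read the empty slot before either one writes. */
	seen_a = atomic_load(&slot);
	seen_b = atomic_load(&slot);

	/* CPU A wins the compare-and-exchange. */
	if (try_cache(seen_a, &rt_a))
		tracked++;

	/* CPU B loses because its snapshot of the slot is stale. */
	if (try_cache(seen_b, &rt_b)) {
		tracked++;
	} else if (handle_failure) {
		/* Shape of the fix: mark as not cached and track separately. */
		add_uncached(&rt_b);
		tracked++;
	}
	return tracked;
}
```

Calling `simulate_race(false)` models the old behavior, where the losing route is tracked nowhere and therefore leaks. `simulate_race(true)` models the patched path, where the losing route ends up on the uncached list.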


