[PATCH 1/4] net: skb_orphan on dev_hard_start_xmit

Eric Dumazet eric.dumazet at gmail.com
Thu Jun 4 00:54:24 EDT 2009

David Miller wrote:
> From: Rusty Russell <rusty at rustcorp.com.au>
> Date: Thu, 4 Jun 2009 13:24:57 +0930
>> On Thu, 4 Jun 2009 06:32:53 am Eric Dumazet wrote:
>>> Also, taking a reference on socket for each xmit packet in flight is very
>>> expensive, since it slows down receiver in __udp4_lib_lookup(). Several
>>> cpus are fighting for sk->refcnt cache line.
>> Now we have decent dynamic per-cpu, we can finally implement bigrefs.  More 
>> obvious for device counts than sockets, but perhaps applicable here as well?
> It might be very beneficial for longer lasting, active, connections, but
> for high connection rates it's going to be a lose in my estimation.


We can also avoid the sock_hold()/sock_put() pair for each tx packet and
only touch sk_wmem_alloc (with an appropriate atomic_sub_return() in
sock_wfree() and atomic_dec_and_test() in sk_free()).

We could initialize sk->sk_wmem_alloc to one instead of 0, so that
sock_wfree() could just synchronize itself with sk_free(): whichever of
the two brings the counter to zero frees the socket.

void sk_free(struct sock *sk)
	if (atomic_dec_and_test(&sk->sk_wmem_alloc))
		__sk_free(sk);

 static inline void skb_set_owner_w(struct sk_buff *skb, struct sock *sk)
-       sock_hold(sk);
        skb->sk = sk;
        skb->destructor = sock_wfree;
        atomic_add(skb->truesize, &sk->sk_wmem_alloc);

 void sock_wfree(struct sk_buff *skb)
        struct sock *sk = skb->sk;
+       int res;

        /* In case it might be waiting for more memory. */
-       atomic_sub(skb->truesize, &sk->sk_wmem_alloc);
+       res = atomic_sub_return(skb->truesize, &sk->sk_wmem_alloc);
        if (!sock_flag(sk, SOCK_USE_WRITE_QUEUE))
                sk->sk_write_space(sk);
-       sock_put(sk);
+       if (res == 0)
+               __sk_free(sk);
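
To make the ordering argument concrete, here is a minimal userspace sketch of
the same idea, using C11 atomics instead of the kernel atomic_t API. All the
mock_* names are hypothetical stand-ins invented for illustration, not kernel
code, and mock_really_free() only records that the final free happened. The
counter starts at 1, each packet in flight adds its truesize, and whoever
brings the counter to zero does the free.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical userspace stand-in for struct sock; not the kernel type. */
struct mock_sock {
	atomic_int wmem_alloc;	/* plays the role of sk->sk_wmem_alloc */
	bool freed;		/* set by mock_really_free(), for observation only */
};

/* Stand-in for __sk_free(): only records that the final free happened. */
static void mock_really_free(struct mock_sock *sk)
{
	assert(!sk->freed);	/* the final free must run exactly once */
	sk->freed = true;
}

/* The counter starts at 1: that unit belongs to the socket owner. */
static void mock_sock_init(struct mock_sock *sk)
{
	atomic_init(&sk->wmem_alloc, 1);
	sk->freed = false;
}

/* Like skb_set_owner_w(): charge the packet, no separate refcount hold. */
static void mock_charge(struct mock_sock *sk, int truesize)
{
	atomic_fetch_add(&sk->wmem_alloc, truesize);
}

/* Like sock_wfree(): uncharge the packet, free if it was the last user. */
static void mock_uncharge(struct mock_sock *sk, int truesize)
{
	/* atomic_fetch_sub() returns the old value: old == truesize means new == 0. */
	if (atomic_fetch_sub(&sk->wmem_alloc, truesize) == truesize)
		mock_really_free(sk);
}

/* Like sk_free(): drop the initial unit, free only if nothing is in flight. */
static void mock_sk_free(struct mock_sock *sk)
{
	if (atomic_fetch_sub(&sk->wmem_alloc, 1) == 1)
		mock_really_free(sk);
}
```

If mock_sk_free() runs while a packet is still charged, the free is deferred
to the matching mock_uncharge(); if every packet has already been uncharged,
mock_sk_free() frees immediately. Either way the free happens once, and the
tx path touches a single counter.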

Patch will follow after some testing.

