Re: Possible regression with skb_clone() in 2.6.36

Hi Mat,

* Gustavo F. Padovan <padovan@xxxxxxxxxxxxxx> [2010-09-10 16:45:09 -0300]:

> Hi Mat,
> 
> * Mat Martineau <mathewm@xxxxxxxxxxxxxx> [2010-09-10 09:53:31 -0700]:
> 
> > 
> > Gustavo -
> > 
> > I'm not sure why the streaming code used to work, but this does not 
> > look like an skb_clone() problem.  Your patch to remove the 
> > skb_clone() call in l2cap_streaming_send() addresses the root cause of 
> > this crash.
> > 
> > On Wed, 8 Sep 2010, Gustavo F. Padovan wrote:
> > 
> > > I've been experiencing some problems when running L2CAP Streaming mode in
> > > 2.6.36. The system quickly runs into an out-of-memory condition and crashes. That
> > > wasn't happening before, so I think we may have a regression here (I haven't
> > > found where yet). The crash log is below.
> > >
> > > The following patch does not fix the regression, but it shows that removing the
> > > skb_clone() call from l2cap_streaming_send() works around the problem. The
> > > patch is good anyway because it saves memory and time.
> > >
> > > For now I have no idea how to fix this.
> > >
> > > <snip>
> > 
> > This has to do with the sk->sk_wmem_alloc accounting that controls the 
> > amount of write buffer space used on the socket.
> > 
> > When the L2CAP streaming mode socket segments its data, it allocates 
> > memory using sock_alloc_send_skb() (via bt_skb_send_alloc()).  Before 
> > that allocation call returns, skb_set_owner_w() is called on the new 
> > skb.  This adds to sk->sk_wmem_alloc and sets skb->destructor so that 
> > sk->sk_wmem_alloc is correctly updated when the skb is freed.
> > 
> > When that skb is cloned, the clone is not "owned" by the write buffer. 
> > The clone's destructor is set to NULL in __skb_clone().  The version 
> > of l2cap_streaming_send() that runs out of memory is passing the 
> > non-owned skb clone down to the HCI layer.  The original skb (the one 
> > that's "owned by w") is immediately freed, which adjusts 
> > sk->sk_wmem_alloc back down - the socket thinks it has unlimited write 
> > buffer space.  As a result, bt_skb_send_alloc() never blocks waiting 
> > for buffer space (or returns EAGAIN for nonblocking writes) and the 
> > HCI send queue keeps growing.
> 
> If the problem is what you are saying, adding a skb_set_owner_w(skb, sk) call on
> the cloned skb should solve the problem, but it doesn't. That's exactly
> what tcp_transmit_skb() does. Also, this only appeared in 2.6.36; it was
> working fine before, i.e., we have a regression here.

I've run some more tests, and what you said does fix the problem for
Streaming Mode: if I use skb_set_owner_w() on the cloned skb, everything
works fine. But we still have the problem with ERTM that I described:
send() blocks waiting for memory. The regression is still there.
:(
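
To make the accounting concrete, here is a rough userspace model of the
Streaming Mode path. It is only a sketch, not kernel code: every toy_*
name is made up for illustration, the "is there room" check is
simplified to wmem_alloc < sndbuf, and the lower layer is assumed never
to drain its queue. It counts how many frames get queued with and
without charging the clone to the socket.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model only: all toy_* names are hypothetical, not kernel APIs. */
struct toy_sock {
    int wmem_alloc; /* bytes currently charged to the write buffer */
    int sndbuf;     /* write buffer limit */
};

struct toy_skb {
    struct toy_sock *owner; /* NULL: no destructor, not charged */
    int truesize;
};

/* Models skb_set_owner_w(): charge the skb to the socket. */
static void toy_set_owner_w(struct toy_skb *skb, struct toy_sock *sk)
{
    skb->owner = sk;
    sk->wmem_alloc += skb->truesize;
}

/* Models sock_alloc_send_skb(): fails once the write buffer is full
 * (the real call would block or return -EAGAIN). */
static bool toy_alloc_send_skb(struct toy_sock *sk, struct toy_skb *skb,
                               int size)
{
    if (sk->wmem_alloc >= sk->sndbuf)
        return false;
    skb->truesize = size;
    skb->owner = NULL;
    toy_set_owner_w(skb, sk);
    return true;
}

/* Models __skb_clone(): the clone starts without an owner. */
static struct toy_skb toy_clone(const struct toy_skb *skb)
{
    struct toy_skb clone = *skb;
    clone.owner = NULL;
    return clone;
}

/* Models freeing an skb: the destructor uncharges the socket. */
static void toy_free(struct toy_skb *skb)
{
    if (skb->owner) {
        skb->owner->wmem_alloc -= skb->truesize;
        skb->owner = NULL;
    }
}

/* Try to send max_frames frames; each clone stays on a queue that is
 * never drained. Returns the number of frames that were queued. */
static int toy_streaming_send(struct toy_sock *sk, int max_frames,
                              int frame_size, bool charge_clone)
{
    int queued = 0;

    for (int i = 0; i < max_frames; i++) {
        struct toy_skb skb;

        if (!toy_alloc_send_skb(sk, &skb, frame_size))
            break;
        struct toy_skb clone = toy_clone(&skb);
        if (charge_clone)
            toy_set_owner_w(&clone, sk);
        toy_free(&skb); /* original is freed right away */
        queued++;       /* clone is left on the stalled queue */
    }
    return queued;
}
```

With a 4096-byte sndbuf and 1024-byte frames, the uncharged variant
accepts every frame it is offered and wmem_alloc ends at 0, while the
charged variant stops once the limit is reached.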

> 
> > 
> > This isn't a problem for the ERTM sends, because the original skbs are 
> > kept in the ERTM tx queue until they are acked.  Once they're acked, 
> > the write buffer space is freed and additional skbs can be allocated.
> 
> It affects ERTM as well, but in that case the kernel doesn't crash
> because ERTM blocks on sending while trying to allocate memory. Then we are not
> able to receive any acks (everything stays queued in sk_backlog_queue because
> the sk is owned by the user) and ERTM stalls.
> 
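The ERTM stall can be sketched the same way. Again this is a toy
userspace model with made-up toy_* names, not the real locking code: a
sender that holds the socket lock waits for write buffer space, acks
that arrive meanwhile are parked on the backlog, and the backlog is only
processed when the lock is released. The drop_lock_while_waiting flag is
a hypothetical contrast case, not a proposed fix.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only: all toy_* names are hypothetical, not kernel APIs. */
struct toy_ertm_sock {
    int wmem_alloc;     /* bytes held by unacked frames */
    int sndbuf;         /* write buffer limit */
    bool owned_by_user; /* true while a sender holds the socket lock */
    int backlog;        /* acks parked while the socket is owned */
    int frame_size;     /* bytes released by one ack */
};

/* An ack frees the write buffer space of one unacked frame. */
static void toy_process_ack(struct toy_ertm_sock *sk)
{
    if (sk->wmem_alloc >= sk->frame_size)
        sk->wmem_alloc -= sk->frame_size;
}

/* Receive path: park the ack if the socket is owned by the user. */
static void toy_rcv_ack(struct toy_ertm_sock *sk)
{
    if (sk->owned_by_user)
        sk->backlog++;
    else
        toy_process_ack(sk);
}

/* Releasing the lock is what drains the backlog. */
static void toy_release_sock(struct toy_ertm_sock *sk)
{
    sk->owned_by_user = false;
    while (sk->backlog > 0) {
        sk->backlog--;
        toy_process_ack(sk);
    }
}

/* The sender takes the lock and waits for space while `acks` acks
 * arrive. Returns true if space is available after the wait. */
static bool toy_wait_for_wmem(struct toy_ertm_sock *sk, int acks,
                              bool drop_lock_while_waiting)
{
    sk->owned_by_user = true;         /* sender takes the lock */
    if (drop_lock_while_waiting)
        toy_release_sock(sk);         /* hypothetical contrast case */
    for (int i = 0; i < acks; i++)
        toy_rcv_ack(sk);
    if (drop_lock_while_waiting)
        sk->owned_by_user = true;     /* take the lock again */
    return sk->wmem_alloc < sk->sndbuf;
}
```

Starting from a full buffer, the sender that keeps the lock ends up with
all acks on the backlog and no space freed, which is the stall described
above.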

-- 
Gustavo F. Padovan
ProFUSION embedded systems - http://profusion.mobi