Re: Out of memory + skbuff_head_cache

Ramesh <rramesh1@xxxxxxxxx> · Wed, 15 Apr 2009 09:03:40 -0700

On Wed, Apr 15, 2009 at 3:42 AM, Tekale Sharad-FHJN78
<FHJN78@xxxxxxxxxxxx> wrote:
>
> Hi Ramesh,
>
>
>>It was an AP too in our case, we were pumping data from wired to
> wireless and vice-versa when we got
>>into this situation. In our case we used printk and we wrote some dump
> routines for eg printing
>>the skb_headroom and skb_tailroom and true.  In our case another module
> changed the headroom for some skbs
>>and those skbs were never actually getting freed.
>
> Thanks for hint, But...
>
> I have one more query,
>
> In our case, Instead of pumping huge amount of traffic at once, when we
> just start a single stream of chariot,
> then there seems to be no memory leak. So, Now I'm unable to understand
> is it really a leak or something else.
>
> The below link states the following...
>        "The problem seems to be related to the rate by which memory
> management
>        functions like alloc_skb/kfree_skb in this case are called."
> http://ozlabs.org/pipermail/linuxppc-embedded/2004-March/013667.html
>
> The work around they mention is "either having static skb pools or
> stopping interrupts during congestion."
>
> Did any one implemented static skb pools, if so, can any one guide me,
> How I can achieve the same or refer to some document or any link, that
> talks about same?
>
> Any help highly appreciated.
>
> Thanks,
> Sharad.
>
>
>
>
> -----Original Message-----
> From: Ramesh [mailto:rramesh1@xxxxxxxxx]
> Sent: Tuesday, April 14, 2009 9:44 AM
> To: Tekale Sharad-FHJN78
> Cc: kernelnewbies@xxxxxxxxxxxx
> Subject: Re: Out of memory + skbuff_head_cache
>
> Tekale Sharad-FHJN78 wrote:
>> Hi Ramesh,
>>
>> We are using chariot and iperf tools to pump traffic.
>>
>>
>>> Also which ethernet driver are you using?
>>>
>> I'm using marvell switch driver on Openwrt.
>>
>>
>>> but we used bonding driver which wasn't allocating enough memory for
>>>
>> some fragments causing a leak and subsequently a
>>
>>> panic at a later stage.
>>>
>> Although we are not using bonding driver, but my case is similar to
>> yours, i.e. skbuff_head_cache goes on increasing and at threshold
>> value, kernel crashes.
>>
> [R] It was an AP too in our case, we were pumping data from wired to
> wireless and vice-versa when we got into this situation. In our case we
> used printk and we wrote some dump routines for eg printing the
> skb_headroom and skb_tailroom and true.  In our case another module
> changed the headroom for some skbs and those skbs were never actually
> getting freed.
>>
>>> You might want to try dumping & match the allocated size for each skb
>
>>> &
>>>
>> match with the corresponding free.
>> Do you mean to use printk or is there any other way for dumping???
>>
> [R]
>> Thanks,
>> Sharad.
>>
>>
>> -----Original Message-----
>> From: Ramesh [mailto:rramesh1@xxxxxxxxx]
>> Sent: Tuesday, April 14, 2009 12:01 AM
>> To: Tekale Sharad-FHJN78
>> Cc: kernelnewbies@xxxxxxxxxxxx
>> Subject: Re: Out of memory + skbuff_head_cache
>>
>> Tekale Sharad-FHJN78 wrote:
>>
>>> Hi,
>>>
>>> I'm using linux 2.6.21.5 and our kernel is freeze.
>>>
>>
>>
>>>
>>> The problem I'm facing is, when I create a software bridge using
>>> *$brctl* command. and add two interfaces say, eth0.0 and eth0.1, this
>
>>> way...
>>>
>>> $brctl addbr br-lan
>>> $brctl addif br-lan eth0.0
>>> $brctl addif br-lan eth0.1
>>>
>>> When I send traffic from a host connected to one port to host
>>> connected at other at or above end 60Mbits/sec ,  soon all the
>>> *memory
>>>
>>
>>
>>> is dried up/consumed and and system crashes.*
>>>
>>> Observation:
>>> On initial start up:
>>> $cat /proc/slabinfo | grep skbuff_head_cache
>>> skbuff_head_cache    120    120    192   20    1 : tunables  120
>>> 60    0 : slabdata      6
>>>
>>> Before crash:
>>> $cat /proc/slabinfo | grep skbuff_head_cache
>>> skbuff_head_cache   4260   4260    192   20    1 : tunables  120
>>> 60    0 : slabdata    213    213      0
>>>
>>> Can any one help me to refer to some patch or point to some location
>>> in code from where memory is failed to deallocate.
>>>
>>> Thanks,
>>> Sharad.
>>>
>>
>> Hello Sharad,
>>
>> What type of packets & packet sizes are you sending ? Also which
>> ethernet driver are you using?
>> In a problem what I faced before, we had a similar situation, but we
>> used bonding driver which wasnt allocating enough memory for some
>> fragments causing a leak and subsequently a panic at a later stage.
>>
>> You might want to try dumping & match the allocated size for each skb
>> & match with the corresponding free.
>>
>> Regards
>> Ramesh
>>
>>
>
>

Hello Sharad,

Sorry if I sounded ambiguous. In my case it was a leak, but that need
not be true in your case. The problem is definitely related to either
allocation/deallocation or a headroom/tailroom issue. I have no
experience with static skb pools, though.

You could try to isolate that problem further:

See if the problem happens in the Tx path or the Rx path.

Then, in whichever path the problem happens, try to verify the actual
allocation (please pay attention to headroom/tailroom).

Please check for bug-fix patches for the kernel version that you are
running that relate to the problem you are seeing. If it's already
fixed, all you need is the right patch.
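
To make the allocation check above a bit more concrete, here is a
rough user-space sketch of the bookkeeping I have in mind. It is not
kernel code: all of the trk_* names are made up for illustration, and
in the kernel you would call something like this next to your
alloc_skb()/kfree_skb() sites, passing in skb_headroom(skb), and dump
the counters with printk.

```c
#include <assert.h>
#include <stddef.h>

/* All names here are hypothetical; this is user-space bookkeeping,
 * not a kernel API. */
enum trk_path { TRK_RX = 0, TRK_TX = 1, TRK_PATHS = 2 };

#define TRK_SLOTS 1024

struct trk_entry {
    const void *buf;        /* NULL means the slot is unused */
    enum trk_path path;     /* path that allocated the buffer */
    unsigned int headroom;  /* headroom seen at allocation time */
};

static struct trk_entry trk_table[TRK_SLOTS];
static unsigned long trk_live[TRK_PATHS];

/* Note an allocation. Returns 0, or -1 if the table is full. */
static int trk_alloc(const void *buf, enum trk_path path,
                     unsigned int headroom)
{
    size_t i;

    assert(buf != NULL);
    for (i = 0; i < TRK_SLOTS; i++) {
        if (trk_table[i].buf == NULL) {
            trk_table[i].buf = buf;
            trk_table[i].path = path;
            trk_table[i].headroom = headroom;
            trk_live[path]++;
            return 0;
        }
    }
    return -1;
}

/* Note a free. Returns 0 if it matches the allocation, 1 if the
 * headroom changed in between, -1 if the buffer was never recorded. */
static int trk_free(const void *buf, unsigned int headroom_now)
{
    size_t i;

    for (i = 0; i < TRK_SLOTS; i++) {
        if (buf != NULL && trk_table[i].buf == buf) {
            int changed = trk_table[i].headroom != headroom_now;

            trk_live[trk_table[i].path]--;
            trk_table[i].buf = NULL;
            return changed;
        }
    }
    return -1;
}

/* Buffers allocated on this path and not yet freed. */
static unsigned long trk_outstanding(enum trk_path path)
{
    return trk_live[path];
}
```

If trk_outstanding() keeps growing on one path while the traffic rate
is steady, that is the path to dig into. A return of 1 from trk_free()
would point at some other module having touched the headroom.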

And if you suspect it's the rate of packets that matters, you could
temporarily stall either Tx or Rx (wherever the rate is abnormal) and
put a limiting check in the bridge module to protect your data
structures.
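
One way to picture such a limiting check is a simple high/low
watermark gate on the number of buffers in flight. The sketch below is
plain user-space C under my own assumptions: the skb_gate names and
the thresholds are hypothetical, and a real version in the bridge or
driver would also need locking.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical gate: stop taking new buffers at 'high' in flight and
 * resume only once the count has drained down to 'low'. */
struct skb_gate {
    unsigned int in_flight;  /* taken and not yet released */
    unsigned int low;        /* resume threshold */
    unsigned int high;       /* stall threshold */
    bool stalled;
};

static void gate_init(struct skb_gate *g, unsigned int low,
                      unsigned int high)
{
    assert(low < high);
    g->in_flight = 0;
    g->low = low;
    g->high = high;
    g->stalled = false;
}

/* Returns true if the caller may take one more buffer. */
static bool gate_try_take(struct skb_gate *g)
{
    if (g->stalled)
        return false;
    g->in_flight++;
    if (g->in_flight >= g->high)
        g->stalled = true;
    return true;
}

/* Call once for every buffer that is freed. */
static void gate_release(struct skb_gate *g)
{
    if (g->in_flight > 0)
        g->in_flight--;
    if (g->stalled && g->in_flight <= g->low)
        g->stalled = false;
}
```

The gap between the two thresholds keeps the path from flapping
between stalled and running on every single free.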

One thing I have done, for example, in an older stack is to limit
ping (echo) responses to 200 packets per second when the incoming
echo request rate is very high. You could tune that limit according
to your available skbs and determine the limiting factor dynamically.
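
A per-second limit like that can be sketched as a fixed-window
counter. This is only an illustration in user-space C: the pkt_limiter
names are hypothetical, the time is passed in as milliseconds by the
caller (in the kernel you would derive it from jiffies), and 200 is
just the figure from my example above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical fixed-window limiter: at most 'limit' packets are
 * allowed in each one-second window. */
struct pkt_limiter {
    uint64_t window_start_ms;  /* start of the current window */
    uint32_t count;            /* packets allowed in this window */
    uint32_t limit;            /* packets allowed per window */
};

static void limiter_init(struct pkt_limiter *l, uint32_t limit_per_sec,
                         uint64_t now_ms)
{
    assert(limit_per_sec > 0);
    l->window_start_ms = now_ms;
    l->count = 0;
    l->limit = limit_per_sec;
}

/* Returns true if this packet may be answered, false to drop it. */
static bool limiter_allow(struct pkt_limiter *l, uint64_t now_ms)
{
    /* Start a new window once a full second has passed. */
    if (now_ms - l->window_start_ms >= 1000) {
        l->window_start_ms = now_ms;
        l->count = 0;
    }
    if (l->count >= l->limit)
        return false;
    l->count++;
    return true;
}
```

To tune it dynamically, the limit field could be lowered when free
skbs run short and raised again once they recover.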

Good luck
/Ramesh
