Re: PROBLEM: Memory leak (at least with SLUB) from "secpath_dup" (xfrm) in 3.5+ kernels

Eric Dumazet <eric.dumazet@xxxxxxxxx> · Sun, 21 Oct 2012 23:47:33 +0200

On Mon, 2012-10-22 at 01:51 +0600, Mike Kazantsev wrote:
> On Mon, 22 Oct 2012 00:43:32 +0600
> Mike Kazantsev <mk.fraggod@xxxxxxxxx> wrote:
> 
> > > On Sun, 21 Oct 2012 15:29:43 +0200
> > > Eric Dumazet <eric.dumazet@xxxxxxxxx> wrote:
> > > 
> > > > 
> > > > Did you try linux-3.7-rc2 (or linux-3.7-rc1) ?
> > > > 
> > 
> > I just built "torvalds/linux-2.6" (v3.7-rc2) and rebooted into it,
> > started same rsync-over-net test and got kmalloc-64 leaking (it went up
> > to tens of MiB until I stopped rsync, normally these are fixed at ~500
> > KiB).
> > 
> > Unfortunately, I forgot to add slub_debug option and build kmemleak so
> > wasn't able to look at this case further, and when I rebooted with
> > these enabled/built, it was secpath_cache again.
> > 
> > So the previously noted "slabtop showed 'kmalloc-64' being the 99% offender
> > in the past, but with recent kernels (3.6.1), it has changed to
> > 'secpath_cache'" seems to be incorrect, as it seems to depend not on the
> > kernel version, but on some other factor.
> > 
> > Guess I'll try to reboot a few more times to see if I can catch
> > kmalloc-64 leaking (instead of secpath_cache) again.
> > 
> 
> I haven't been able to catch the aforementioned condition, but noticed
> that with v3.7-rc2, the "hex dump" part seems to vary in kmemleak
> traces, and contains all sorts of random stuff, for example:
> 
> unreferenced object 0xffff88002ae2de00 (size 56):
>   comm "softirq", pid 0, jiffies 4295006317 (age 213.066s)
>   hex dump (first 32 bytes):
>     01 00 00 00 01 00 00 00 20 9f f4 28 00 88 ff ff  ........ ..(....
>     2f 6f 72 67 2f 66 72 65 65 64 65 73 6b 74 6f 70  /org/freedesktop
>   backtrace:
>     [<ffffffff814da4e3>] kmemleak_alloc+0x21/0x3e
>     [<ffffffff810dc1f7>] kmem_cache_alloc+0xa5/0xb1
>     [<ffffffff81487bf1>] secpath_dup+0x1b/0x5a
>     [<ffffffff81487df5>] xfrm_input+0x64/0x484
>     [<ffffffff814bbd70>] xfrm6_rcv_spi+0x19/0x1b
>     [<ffffffff814bbd92>] xfrm6_rcv+0x20/0x22
>     [<ffffffff814960c3>] ip6_input_finish+0x203/0x31b
>     [<ffffffff81496542>] ip6_input+0x1e/0x50
>     [<ffffffff81496240>] ip6_rcv_finish+0x65/0x69
>     [<ffffffff814964c3>] ipv6_rcv+0x27f/0x2e0
>     [<ffffffff8140a659>] __netif_receive_skb+0x5ba/0x65a
>     [<ffffffff8140a894>] netif_receive_skb+0x47/0x78
>     [<ffffffff8140b4bf>] napi_skb_finish+0x21/0x54
>     [<ffffffff8140b5ef>] napi_gro_receive+0xfd/0x10a
>     [<ffffffff81372b47>] rtl8169_poll+0x326/0x4fc
>     [<ffffffff8140ad44>] net_rx_action+0x9f/0x188
> 
> Not sure if it's relevant though.
> 
> 

OK, so some layer seems to have a bug when skb->head is allocated at
exactly the requested size, instead of getting extra tailroom from
kmalloc power-of-two rounding.
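
To make the tailroom point concrete, here is a rough userspace sketch of
how much slack power-of-two rounding leaves behind a request. It is only a
model: the names model_kmalloc_size() and model_slack() are made up for
illustration, the 8-byte minimum is an assumption, and real kernels also
have 96- and 192-byte caches that this ignores.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Simplified model of kmalloc size classes: round the request up to the
 * next power of two, starting from an assumed 8-byte minimum. The real
 * allocator also has 96- and 192-byte caches, which are ignored here.
 */
static size_t model_kmalloc_size(size_t request)
{
	size_t size = 8;	/* assumed smallest size class */

	assert(request > 0);
	while (size < request)
		size <<= 1;
	return size;
}

/* Bytes past the requested length that still belong to the same object. */
static size_t model_slack(size_t request)
{
	return model_kmalloc_size(request) - request;
}
```

In this model a 1500-byte request lands in the 2048-byte class and leaves
548 bytes of slack, while a 64-byte request has none. A write a few bytes
past the end would go unnoticed in the first case and not in the second.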

Or some layer writes past the end of the skb->cb[] array.

If you move the sp field in sk_buff, does it change anything?

diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
index 6a2c34e..9b1438a 100644
--- a/include/linux/skbuff.h
+++ b/include/linux/skbuff.h
@@ -395,6 +395,9 @@ struct sk_buff {
 	struct sock		*sk;
 	struct net_device	*dev;
 
+#ifdef CONFIG_XFRM
+	struct	sec_path	*sp;
+#endif
 	/*
 	 * This is the control buffer. It is free to use for every
 	 * layer. Please put your private variables there. If you
@@ -404,9 +407,6 @@ struct sk_buff {
 	char			cb[48] __aligned(8);
 
 	unsigned long		_skb_refdst;
-#ifdef CONFIG_XFRM
-	struct	sec_path	*sp;
-#endif
 	unsigned int		len,
 				data_len;
 	__u16			mac_len,
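
The idea behind moving sp can be sketched with a toy layout in userspace.
This is not the real struct sk_buff: the two structs below, the
sec_path_stub type and the helper names are hypothetical and keep only the
fields that appear in the hunks above. The aligned attribute is the
GCC/Clang spelling.

```c
#include <assert.h>
#include <stddef.h>

struct sec_path_stub;	/* hypothetical stand-in for struct sec_path */

/* Toy layout in the original order: cb[], _skb_refdst, then sp. */
struct toy_skb_before {
	void *sk;
	void *dev;
	char cb[48] __attribute__((aligned(8)));
	unsigned long _skb_refdst;
	struct sec_path_stub *sp;
	unsigned int len, data_len;
};

/* Toy layout after the move: sp sits in front of cb[]. */
struct toy_skb_after {
	void *sk;
	void *dev;
	struct sec_path_stub *sp;
	char cb[48] __attribute__((aligned(8)));
	unsigned long _skb_refdst;
	unsigned int len, data_len;
};

/* Bytes between the end of cb[] and the start of sp in the old order. */
static size_t sp_gap_before(void)
{
	size_t cb_end = offsetof(struct toy_skb_before, cb) +
			sizeof(((struct toy_skb_before *)0)->cb);

	return offsetof(struct toy_skb_before, sp) - cb_end;
}

/* Nonzero if sp lies below cb[], out of reach of a forward overrun. */
static int sp_is_below_cb_after(void)
{
	return offsetof(struct toy_skb_after, sp) <
	       offsetof(struct toy_skb_after, cb);
}
```

In the old order sp sits only one unsigned long past the end of cb[], so a
short overrun of cb[] can reach it. In the new order a forward overrun
hits other fields instead, so the symptom should change if that is the
cause.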




Also try increasing the tailroom in __netdev_alloc_skb():

diff --git a/net/core/skbuff.c b/net/core/skbuff.c
index 6e04b1f..972ee4f 100644
--- a/net/core/skbuff.c
+++ b/net/core/skbuff.c
@@ -427,7 +427,7 @@ struct sk_buff *__netdev_alloc_skb(struct net_device *dev,
 				   unsigned int length, gfp_t gfp_mask)
 {
 	struct sk_buff *skb = NULL;
-	unsigned int fragsz = SKB_DATA_ALIGN(length + NET_SKB_PAD) +
+	unsigned int fragsz = SKB_DATA_ALIGN(length + NET_SKB_PAD + 64) +
 			      SKB_DATA_ALIGN(sizeof(struct skb_shared_info));
 
 	if (fragsz <= PAGE_SIZE && !(gfp_mask & (__GFP_WAIT | GFP_DMA))) {
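
For a feel of what the extra 64 bytes do to fragsz, here is the same
arithmetic as a userspace sketch. The three MODEL_* constants are assumed
values for illustration only (the real SMP_CACHE_BYTES, NET_SKB_PAD and
sizeof(struct skb_shared_info) depend on arch and config), and the helper
names are made up.

```c
#include <assert.h>

/* Assumed values for illustration only; the real ones depend on arch/config. */
#define MODEL_CACHE_BYTES	64u	/* stands in for SMP_CACHE_BYTES */
#define MODEL_NET_SKB_PAD	64u	/* stands in for NET_SKB_PAD */
#define MODEL_SHINFO_SIZE	320u	/* stands in for sizeof(struct skb_shared_info) */

/* Round x up to a multiple of the model cache line, like SKB_DATA_ALIGN(). */
static unsigned int model_data_align(unsigned int x)
{
	return (x + MODEL_CACHE_BYTES - 1) & ~(MODEL_CACHE_BYTES - 1);
}

/* fragsz as computed in the hunk, with `extra` bytes of added tailroom. */
static unsigned int model_fragsz(unsigned int length, unsigned int extra)
{
	return model_data_align(length + MODEL_NET_SKB_PAD + extra) +
	       model_data_align(MODEL_SHINFO_SIZE);
}
```

With these assumed values, a length of 1500 gives a fragsz of 1920 without
the extra bytes and 1984 with them, so the data area gains one full cache
line of tailroom.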

