On 09/18/2014 07:36 AM, Jamal Hadi Salim wrote:
> On 09/16/14 17:27, Vlad Yasevich wrote:
>
>> Hi Jamal
>>
>> I finally was able to dig into this issue some more and here
>> is what I've found.
>>
>
> Thanks for looking Vlad.
>
>> I am not sure why you say above the recvmsg() failed for you, but
>> for me it worked correctly. There was a notification sitting
>> on the socket queue and select() call would trigger when called.
>>
>
> Didnt work for me - wonder if libev is expecting something of me.
> The sample posted code: If you do a recvmsg on the timer callback,
> that would be close to what i did.
>
>> So for the reproducer you've provided we have 2 leaks:
>> 1) For every failed connect, you will have a notification sitting
>> on the socket queue. The more connects failed, the more notifications
>> you'll have.
>> 2) Every notification holds a reference on the association that generated
>> it. As long as notifications are queued, the old associations will
>> remain in memory.
>>
>
> nod.
>
>> What makes the above condition really bad is that notifications don't appear
>> to be checked against the socket receive buffer or the sctp_rmem variables.
>> As such, you can very easily exhaust memory by generating a ton of notifications.
>
> Yes. It would have been helpful debugging this if it got tied to the
> process.
>
>> I am working on the patch to fix both of the above issues. We will not
>> be able to do much about queuing notifications, but we'll at least be able
>> to limit them to either socket receive buffer or sctp_rmem whichever is
>> smaller. If you have a program that just calls connect in a loop with
>> notifications enabled, the app will eventually run out of receive buffer space
>> if it doesn't drain the notifications.
>
> I think that is reasonable. Only need to solve the mystery of why i saw
> nothing on recvmsg.
> Is it possible to emulate what TCP does?
> e.g associate related connects instead of creating new associations?

What I was seeing in the logs is that the association was actually failing
before a new connect was generated. I tweaked your application to get rid of
libev and just run in a while loop with a very short select timeout. If I make
it short enough, then I can get an in-progress error every now and then.

-vlad

> That way very little memory is used and i dont get "in progress" code every time when that
> last connect just failed.
>
>> As for associations, we can drop the reference from the notification thus
>> allowing the memory for the association to actually go away.
>
> Ok.
>
> cheers,
> jamal
>
>> -vlad
>>
>>>
>>> If you run this long enough(24 hours or so) you will see the oom
>>> killer come in upset about sctp_association_new():
>>>
>>> ---
>>> Call Trace:
>>> [<ffffffff80145508>] show_stack+0x68/0x80
>>> [<ffffffff8061e9c8>] dump_header.isra.12+0x78/0x1ac
>>> [<ffffffff801d2358>] oom_kill_process+0x2e8/0x440
>>> [<ffffffff801d2998>] out_of_memory+0x2b8/0x2e8
>>> [<ffffffff801d7084>] __alloc_pages_nodemask+0x774/0x788
>>> [<ffffffff80210c60>] cache_alloc_refill+0x470/0x7b0
>>> [<ffffffff802107c4>] kmem_cache_alloc+0xe4/0x110
>>> [<ffffffffc008a214>] sctp_association_new+0x54/0x688 [sctp]
>>> [<ffffffffc009c92c>] __sctp_connect+0x274/0x618 [sctp]
>>> [<ffffffffc009ce84>] sctp_connect+0x7c/0xe8 [sctp]
>>> [<ffffffff8053d030>] SyS_connect+0xd8/0xf8
>>> [<ffffffff8014a0a4>] handle_sys64+0x44/0x68
>>> -----
>>>
>>> I am sorry I dont have time to chase the kernel code
>>> (and will have to work around it in user space in our code).
>>>
>>> Longer version:
>>> ==============
>>>
>>> Attached program initially tries to connect to a server which is not up
>>> yet. At some point the server comes up and all the issues i observe
>>> go away i.e resulting memory consumption goes to zero.
>>>
>>> The issue i am about to describe happens on all kernel versions i have
>>> tested on (including latest and all the way back to 2.6.32 running on
>>> a MIPS board).
>>>
>>> How to observe the issue:
>>> on xterm 1:
>>> sudo watch "cat /proc/slabinfo | grep -i ^kmalloc-"
>>>
>>> on xterm 2:
>>> run the attached program.
>>>
>>> In my laptop the pages are 4K, so i would see kmalloc-4096 consumption
>>> going up.
>>>
>>> If you want actually to narrow this down - then compile the kernel with
>>> CONFIG_SCTP_DBG_OBJCNT (or you can believe what i am saying below).
>>> do a:
>>>
>>> ----
>>> Every 2.0s: sudo cat /proc/net/sctp/sctp_dbg_objcnt    Fri Aug 29 11:34:35 2014
>>> sock: 5
>>> ep: 5
>>> assoc: 279
>>> transport: 1
>>> chunk: 0
>>> bind_addr: 0
>>> bind_bucket: 3
>>> addr: 4
>>> ssnmap: 0
>>> datamsg: 0
>>> ------
>>>
>>> And
>>>
>>> When i start the server 3-4 minutes later and the two ends talk to each
>>> other, the counters go down:
>>>
>>> ---
>>> Every 2.0s: sudo cat /proc/net/sctp/sctp_dbg_objcnt    Fri Aug 29 11:37:38 2014
>>> sock: 12
>>> ep: 12
>>> assoc: 6
>>> transport: 6
>>> chunk: 0
>>> bind_addr: 0
>>> bind_bucket: 7
>>> addr: 16
>>> ssnmap: 6
>>> datamsg: 0
>>> -------------
>>>
>>> cheers,
>>> jamal
>>>

--
To unsubscribe from this list: send the line "unsubscribe linux-sctp" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
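
The thread says an application that calls connect in a loop with notifications enabled has to drain the notifications, or they pile up on the socket queue. Below is a minimal sketch of such a drain step. It is not the reproducer from the thread (that program is not shown here) and it is written in Python only for brevity. The function name `drain_queue` and its parameters are hypothetical. `MSG_NOTIFICATION` is not exported by Python's socket module, so the value 0x8000 is copied from the Linux SCTP header as an assumption. The demo uses an AF_UNIX datagram socketpair as a stand-in, so it runs without SCTP support and never sees a real notification.

```python
import socket

# Assumption: value of MSG_NOTIFICATION as defined in the Linux SCTP uapi
# header (linux/sctp.h). Python's socket module does not export it.
MSG_NOTIFICATION = 0x8000


def drain_queue(sock, bufsize=4096, limit=1024):
    """Read whatever is queued on `sock` right now, without blocking.

    Returns a (notifications, data_messages) tuple of counts.
    `limit` bounds the number of reads so one call cannot spin forever.
    Errors other than "would block" are left to propagate to the caller.
    """
    notifications = 0
    data_messages = 0
    for _ in range(limit):
        try:
            data, _ancdata, msg_flags, _addr = sock.recvmsg(
                bufsize, 0, socket.MSG_DONTWAIT
            )
        except BlockingIOError:
            break  # nothing left on the queue
        if msg_flags & MSG_NOTIFICATION:
            notifications += 1  # an event record, not user data
        elif data:
            data_messages += 1
        else:
            break  # zero-length read without the flag: treat as end of stream
    return notifications, data_messages


if __name__ == "__main__":
    # Stand-in for the SCTP socket: a local datagram pair with three
    # queued messages. None of them carries MSG_NOTIFICATION.
    rx, tx = socket.socketpair(socket.AF_UNIX, socket.SOCK_DGRAM)
    for _ in range(3):
        tx.send(b"ping")
    print(drain_queue(rx))
    rx.close()
    tx.close()
```

In a loop like the one described in the thread, a call of this kind would presumably go after each failed connect (or in the select/timer callback), so that queued notifications do not accumulate between attempts.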