On 01/19/2016 10:59 AM, Marcelo Ricardo Leitner wrote:
> Em 19-01-2016 12:19, Vlad Yasevich escreveu:
>> On 01/15/2016 04:40 PM, Marcelo Ricardo Leitner wrote:
>>> On Fri, Jan 15, 2016 at 08:11:03PM +0100, Dmitry Vyukov wrote:
>>>> On Fri, Jan 15, 2016 at 7:46 PM, Marcelo Ricardo Leitner
>>>> <marcelo.leitner@xxxxxxxxx> wrote:
>>>>> On Wed, Dec 30, 2015 at 09:42:27PM +0100, Dmitry Vyukov wrote:
>>>>>> Hello,
>>>>>>
>>>>>> The following program leads to a leak of two sock objects:
>>>>> ...
>>>>>>
>>>>>> On commit 8513342170278468bac126640a5d2d12ffbff106 (Dec 28).
>>>>>
>>>>> I'm afraid I cannot reproduce this one?
>>>>> I enabled dynprintk at sctp_destroy_sock and it does print twice when I
>>>>> run this test app.
>>>>> Also added debugs to check association lifetime, and then it was
>>>>> destroyed. Same for the endpoint.
>>>>>
>>>>> Checking with trace-cmd, both calls to sctp_close() resulted in
>>>>> sctp_destroy_sock() being called.
>>>>>
>>>>> As for sock_hold/put, they are matched too.
>>>>>
>>>>> Ideas? Log is below for double checking.
>>>>
>>>> Hummm... I can reproduce it pretty reliably.
>>>>
>>>> [  197.459024] kmemleak: 11 new suspected memory leaks (see
>>>> /sys/kernel/debug/kmemleak)
>>>> [  307.494874] kmemleak: 409 new suspected memory leaks (see
>>>> /sys/kernel/debug/kmemleak)
>>>> [  549.784022] kmemleak: 125 new suspected memory leaks (see
>>>> /sys/kernel/debug/kmemleak)
>>>>
>>>> I double-checked via /proc/slabinfo:
>>>>
>>>> SCTPv6    4373   4420   2368   13   8 : tunables   0   0   0
>>>> : slabdata   340   340   0
>>>>
>>>> SCTPv6 starts at almost 0, but grows without bound while I run the
>>>> program in a loop.
>>>>
>>>> Here are my SCTP-related configs:
>>>>
>>>> CONFIG_IP_SCTP=y
>>>> CONFIG_NET_SCTPPROBE=y
>>>> CONFIG_SCTP_DBG_OBJCNT=y
>>>> # CONFIG_SCTP_DEFAULT_COOKIE_HMAC_MD5 is not set
>>>> # CONFIG_SCTP_DEFAULT_COOKIE_HMAC_SHA1 is not set
>>>> CONFIG_SCTP_DEFAULT_COOKIE_HMAC_NONE=y
>>>> # CONFIG_SCTP_COOKIE_HMAC_MD5 is not set
>>>> # CONFIG_SCTP_COOKIE_HMAC_SHA1 is not set
>>>>
>>>> I am on commit 67990608c8b95d2b8ccc29932376ae73d5818727 and I don't
>>>> seem to have any sctp-related changes on top.
>>>
>>> Ok, now I can. With slub debugging enabled I no longer see calls to
>>> sctp_destroy_sock: I see the calls to sctp_close(), but not to
>>> sctp_destroy_sock().
>>>
>>> And SCTPv6 grew by 2 sockets after the execution.
>>>
>>> Further checking, it's a race within SCTP asoc migration:
>>>
>>> thread 0                             thread 1
>>> - app creates a sock
>>> - sends a packet to itself
>>> - sctp will create an asoc and do
>>>   the implicit handshake
>>> - send the packet
>>>                                      - listen()
>>>                                      - accept() is called and
>>>                                        that asoc is migrated
>>> - packet is delivered
>>> - skb->destructor is called, BUT:
>>>
>>> (note that if accept() is called after the packet is delivered and the
>>> skb is freed, the problem doesn't happen)
>>>
>>> static void sctp_wfree(struct sk_buff *skb)
>>> {
>>>         struct sctp_chunk *chunk = skb_shinfo(skb)->destructor_arg;
>>>         struct sctp_association *asoc = chunk->asoc;
>>>         struct sock *sk = asoc->base.sk;
>>> ...
>>>         atomic_sub(sizeof(struct sctp_chunk), &sk->sk_wmem_alloc);
>>>
>>> and it's pointing to the new socket already. So one socket gets a leak
>>> on sk_wmem_alloc and the other gets a negative value:
>>>
>>> --- a/net/sctp/socket.c
>>> +++ b/net/sctp/socket.c
>>> @@ -1537,12 +1537,14 @@ static void sctp_close(struct sock *sk, long timeout)
>>>  	/* Hold the sock, since sk_common_release() will put sock_put()
>>>  	 * and we have just a little more cleanup.
>>>  	 */
>>> +	printk("%s sock_hold %p\n", __func__, sk);
>>>  	sock_hold(sk);
>>>  	sk_common_release(sk);
>>>
>>>  	bh_unlock_sock(sk);
>>>  	spin_unlock_bh(&net->sctp.addr_wq_lock);
>>>
>>> +	printk("%s sock_put %p %d %d\n", __func__, sk, atomic_read(&sk->sk_refcnt),
>>> atomic_read(&sk->sk_wmem_alloc));
>>>  	sock_put(sk);
>>>
>>>  	SCTP_DBG_OBJCNT_DEC(sock);
>>>
>>> gave me:
>>>
>>> [   99.456944] sctp_close sock_hold ffff880137df8940
>>> ...
>>> [   99.457337] sctp_close sock_put ffff880137df8940 1 -247
>>> [   99.458313] sctp_close sock_hold ffff880137dfef00
>>> ...
>>> [   99.458383] sctp_close sock_put ffff880137dfef00 1 249
>>>
>>> That's why the socket is not freed...
>>
>> Interesting... sctp_sock_migrate() accounts for this race in the
>> receive buffer, but not in the send buffer.
>>
>> On the one hand, I am not crazy about the connect-to-self scenario.
>> On the other, I think that to support it correctly we should support
>> skb migration for the send case just like we do for the receive case.
>
> Yes, not thrilled here either about connect-to-self.
>
> But there is a big difference in how the two work. For rx we can just
> look for the wanted skbs in the rx queue, as they aren't going anywhere,
> but for tx I don't think we can easily block the sctp_wfree() call,
> because it may be happening on another CPU (or am I mistaken here? sctp
> still doesn't have RFS, but even irqbalance could affect this AFAICT),
> and more than one skb may be in transit at a time.

The way it's done now, we wouldn't have to block sctp_wfree. Chunks are
released under lock when they are acked, so we are OK here. The tx
completions will just put 1 byte back to the socket associated with the
tx'ed skb, and that should still be OK, as sctp_packet_release_owner()
will call sk_free().

> The locking for this on sctp_chunk would be pretty nasty, I think, and,
> let's say, normal usage wouldn't benefit from it. Considering the
> possible migration, as we can't trust chunk->asoc right away in
> sctp_wfree, the lock would have to reside in sctp_chunk and we would
> have to go on taking locks one by one on the tx queue for the
> migration. Ugh ;)

No, the chunk manipulation is done under the socket lock, so I don't
think we have to worry about a per-chunk lock. We should always be able
to trust the chunk->asoc pointer, because each chunk holds a ref on the
association.

The only somewhat ugly thing about moving tx chunks is that you have to
potentially walk a lot of lists to move things around. There are all the
lists in the sctp_outqueue struct, plus the per-transport retransmit
list...
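Very roughly, and completely untested, the sketch below is the shape of
what I have in mind. The helper name sctp_for_each_tx_datachunk() and the
clear/set pairing are made up for illustration, and the walk is probably
not exhaustive:

static void sctp_clear_owner_w(struct sctp_chunk *chunk)
{
	/* Undo the tx accounting against the old socket before the
	 * migration, so sctp_wfree() can no longer run against it.
	 */
	skb_orphan(chunk->skb);
}

static void sctp_for_each_tx_datachunk(struct sctp_association *asoc,
				       void (*cb)(struct sctp_chunk *))
{
	struct sctp_outq *q = &asoc->outqueue;
	struct sctp_transport *t;
	struct sctp_chunk *chunk;

	/* Chunks already handed to a transport sit on its transmitted
	 * list until they are acked.
	 */
	list_for_each_entry(t, &asoc->peer.transport_addr_list, transports)
		list_for_each_entry(chunk, &t->transmitted, transmitted_list)
			cb(chunk);

	/* Plus everything still queued in the outqueue itself. */
	list_for_each_entry(chunk, &q->retransmit, transmitted_list)
		cb(chunk);
	list_for_each_entry(chunk, &q->sacked, transmitted_list)
		cb(chunk);
	list_for_each_entry(chunk, &q->abandoned, transmitted_list)
		cb(chunk);
	list_for_each_entry(chunk, &q->out_chunk_list, list)
		cb(chunk);
}

and then sctp_sock_migrate(), with both sockets locked, would do:

	sctp_for_each_tx_datachunk(assoc, sctp_clear_owner_w);
	sctp_assoc_migrate(assoc, newsk);
	sctp_for_each_tx_datachunk(assoc, sctp_set_owner_w);

reusing the existing sctp_set_owner_w() to re-charge each skb to the new
asoc->base.sk.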
Even though the above seems to be a PITA, my main reason for recommending
it is that this can happen in normal situations too. Consider a very busy
association that is transferring a lot of data on a 1-to-many socket. The
app decides to do a peel-off, and we could now be stuck, unable to peel
off for quite a while, if there is a hiccup in the network and we have to
rtx multiple times.

-vlad

> Marcelo