Re: Lingering associations on the server side after process dies

On Wed, Dec 21, 2016 at 11:11:46PM -0800, amar padmanabhan wrote:
> I am trying to figure out an issue where after a process crash
> associations are lingering.
> 
> On the server:
> vagrant@magma-dev:~/build/oai_sgw$ sudo cat /proc/net/sctp/assocs
> ASSOC            SOCK             STY SST ST HBKT ASSOC-ID TX_QUEUE RX_QUEUE UID INODE LPORT RPORT LADDRS <-> RADDRS                  HBINT INS OUTS MAXRT T1X T2X RTXC wmema wmemq sndbuf rcvbuf
> ffff8800b58c7000 ffff8800da4d0bc0 2   1   3  0    9        0        0        0   0     36412 36412 192.168.60.142 <-> *192.168.60.141 7500  3   8    10    0   7   0    1     0     212992 212992
> 
> sudo lsof | grep ffff8800da4d0bc0
> 
> after the application server restarts the client retries with INIT and
> COOKIE_ECHO and the server replies back with INIT_ACK and COOKIE_ACK
> without any notification to the application server and any subsequent
> request from the client is not seen on the server.

Sounds like the asoc leaked, or it is lingering while waiting for
something to complete.
Either way, when the client issues a new INIT, it's this old asoc that
catches it and replies to it, not the new server process, so the new
process never sees the request.

> 
> Some pointers would be useful
> 1. Why can't I see the socket in lsof.

Not sure.

> 2. How do I shutdown the existing association, so the server can
> rebuild the state on the associations, and restart cleanly.

We should confirm if it's really leaked, and fix the leak instead.

The proc output above shows wmema: 1, which probably comes from a
sctp_packet_set_owner_w() call, indicating that an sk_buff is still
live and holding the asoc.
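
To make that check easy to repeat while testing, here is a rough Python
sketch that pulls SOCK, ASSOC-ID, the ports and wmema out of
/proc/net/sctp/assocs and lists entries with non-zero wmema. It assumes
the column layout shown in the header quoted above (other kernel
versions may differ), and the function names are made up for
illustration. Non-zero wmema alone doesn't prove a leak, since a healthy
asoc with data in flight shows it too; it only narrows down what to
look at.

```python
from typing import Dict, List

# Sample copied from the report above, rejoined to one line per record.
SAMPLE = """\
 ASSOC     SOCK   STY SST ST HBKT ASSOC-ID TX_QUEUE RX_QUEUE UID INODE LPORT RPORT LADDRS <-> RADDRS HBINT INS OUTS MAXRT T1X T2X RTXC wmema wmemq sndbuf rcvbuf
ffff8800b58c7000 ffff8800da4d0bc0 2   1   3  0       9        0        0       0     0 36412 36412  192.168.60.142 <-> *192.168.60.141     7500     3     8   10    0    7        0        1        0   212992   212992
"""


def parse_assocs(text: str) -> List[Dict[str, object]]:
    """Parse the text of /proc/net/sctp/assocs (layout assumed as above)."""
    records = []
    for line in text.splitlines()[1:]:  # skip the header line
        fields = line.split()
        # 13 columns before the address lists plus 11 after them; the
        # address lists themselves vary in length (multi-homing).
        if len(fields) < 24:
            continue
        records.append({
            "assoc": fields[0],
            "sock": fields[1],
            "assoc_id": int(fields[6]),
            "lport": int(fields[11]),
            "rport": int(fields[12]),
            # Counted from the end, because the number of addresses varies.
            "wmema": int(fields[-4]),
            "wmemq": int(fields[-3]),
        })
    return records


def nonzero_wmema(records: List[Dict[str, object]]) -> List[Dict[str, object]]:
    """Return the associations that still account for write memory."""
    return [r for r in records if r["wmema"] != 0]


if __name__ == "__main__":
    try:
        with open("/proc/net/sctp/assocs") as f:
            text = f.read()
    except OSError:
        text = SAMPLE  # fall back to the sample when SCTP is not loaded
    for r in nonzero_wmema(parse_assocs(text)):
        print(r["sock"], r["assoc_id"], r["lport"], r["rport"], r["wmema"])
```

Running it before and after killing the server process should show
whether the same SOCK pointer stays around with wmema still set.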

We had similar issues in the recent past, but they should already be
fixed in the kernel you're using. They were related to error handling
situations.

Can you try to come up with a minimal reproducer for this?

> 
> vagrant@magma-dev:~/build/oai_sgw$ uname -a
> Linux magma-dev 4.7.4-040704-generic #201609150330 SMP Thu Sep 15
> 07:32:22 UTC 2016 x86_64 GNU/Linux
> 
> vagrant@magma-dev:~/build/oai_sgw$ sudo modinfo sctp
> filename:       /lib/modules/4.7.4-040704-generic/kernel/net/sctp/sctp.ko
> license:        GPL
> description:    Support for the SCTP protocol (RFC2960)
> author:         Linux Kernel SCTP developers <linux-sctp@xxxxxxxxxxxxxxx>
> alias:          net-pf-10-proto-132
> alias:          net-pf-2-proto-132
> depends:        libcrc32c
> intree:         Y
> vermagic:       4.7.4-040704-generic SMP mod_unload modversions
> parm:           no_checksums:Disable checksums computing and verification (bool)
> 
> The code can be found here:
> https://gitlab.eurecom.fr/oai/openair-cn/blob/develop/SRC/SCTP/sctp_primitives_server.c#L352
> 
> Thanks
> Amar
> --
> To unsubscribe from this list: send the line "unsubscribe linux-sctp" in
> the body of a message to majordomo@xxxxxxxxxxxxxxx
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> 