On Fri, Aug 02, 2013 at 10:50:44PM +0530, Vipul Singhania wrote:
> On Fri, Aug 2, 2013 at 6:30 PM, Neil Horman <nhorman@xxxxxxxxxxxxx> wrote:
> > On Fri, Aug 02, 2013 at 02:24:07PM +0530, Vipul Singhania wrote:
> >> On Thu, Aug 1, 2013 at 5:18 PM, Neil Horman <nhorman@xxxxxxxxxxxxx> wrote:
> >> > On Thu, Aug 01, 2013 at 03:52:50PM +0530, Vipul Singhania wrote:
> >> >> On Wed, Jul 31, 2013 at 6:41 PM, Neil Horman <nhorman@xxxxxxxxxxxxx> wrote:
> >> >> > On Wed, Jul 31, 2013 at 10:33:50AM +0530, Vipul Singhania wrote:
> >> >> >> Thanks for the reply.
> >> >> >>
> >> >> >> There is no firewall in that network. It is just a separate network,
> >> >> >> and I can say they are directly connected to each other using an L1
> >> >> >> switch, with no other connection to the outside world.
> >> >> >>
> >> >> >> It was just a test in which I gave a public IP to one of the
> >> >> >> interfaces on one host.
> >> >> >>
> >> >> >> - The association looks like this with the public IP.
> >> >> >>
> >> >> >> sh-3.2# cat /proc/net/sctp/assocs
> >> >> >> ASSOC SOCK STY SST ST HBKT ASSOC-ID TX_QUEUE RX_QUEUE UID INODE
> >> >> >> LPORT RPORT LADDRS <-> RADDRS HBINT INS OUTS MAXRT T1X T2X RTXC
> >> >> >> ffff8800089b0000 ffff8800335944c0 2 1 3 37916 3 516
> >> >> >> 0 0 10635 48520 7168 127.3.253.1 127.3.21.1 127.4.253.1
> >> >> >> 127.2.253.1 127.1.221.1 164.48.1.1 127.3.254.1 <-> *127.4.252.1
> >> >> >> 7500 300 300 10 0 0 0
> >> >> >> ffff8800089b2000 ffff880033594000 2 1 3 50717 4 516
> >> >> >> 0 0 10634 60890 7169 127.3.253.1 127.3.21.1 127.4.253.1
> >> >> >> 127.2.253.1 127.1.221.1 164.48.1.1 127.3.254.1 <-> *127.4.252.1
> >> >> >> 7500 300 300 10 0 0 0
> >> >> >>
> >> >> >> -----------------------------------------------------------------------------
> >> >> >> - But if I give a private IP (10.1.1.1), it looks like this.
> >> >> >>
> >> >> >> sh-3.2# cat /proc/net/sctp/assocs
> >> >> >> ASSOC SOCK STY SST ST HBKT ASSOC-ID TX_QUEUE RX_QUEUE UID INODE
> >> >> >> LPORT RPORT LADDRS <-> RADDRS HBINT INS OUTS MAXRT T1X T2X RTXC
> >> >> >> ffff88003c721800 ffff8800335944c0 2 1 3 22045 2 0
> >> >> >> 0 0 5674 47434 7169 127.3.253.1 127.3.21.1 127.4.253.1
> >> >> >> 127.2.253.1 127.1.221.1 <-> *127.4.252.1 7500 300 300 10
> >> >> >> 0 0 0
> >> >> >> ffff88003c720800 ffff880033594000 2 1 3 36124 1 0
> >> >> >> 0 0 5673 58513 7168 127.3.253.1 127.3.21.1 127.4.253.1
> >> >> >> 127.2.253.1 127.1.221.1 <-> *127.4.252.1 7500 300 300 10
> >> >> >> 0 0 0
> >> >> >>
> >> >> > I don't see any difference between the two environments here. How exactly are
> >> >> > you 'giving' a private ip here? Are you attempting an ADDIP operation?
> >> >> >
> >> >>
> >> >> [Vipul] -- The difference I can see is that the public IP 164.48.1.1
> >> >> is in the association list. (Why and how it got into this list I am
> >> >> not able to understand.) However, if I assign the IP 10.1.1.1/24 to
> >> >> my eth0, it does not appear in the association.
> >> >> The IP address is assigned using ifconfig.
> >> >>
> >> > Ok, so all you're doing is specifying it on the interface; you're not explicitly
> >> > binding to it in whatever program you have.
> >> >
> >>
> >> [vipul] - Yes, on the client side I am not doing an explicit bind.
> >>
> >> >> >>
> >> >> >> - I may be wrong, but is it possible that when we bind with one IP
> >> >> >> (and multihoming is enabled) the association will be built with
> >> >> >> all available interfaces?
> >> >> >>
> >> >> > The opposite in fact. If you bind to a local address the association on that
> >> >> > socket will be created using only the bound address; if you do not bind to a
> >> >> > local address (the autobind case), and multihoming is enabled, then all
> >> >> > available addresses will be used.
> >> >> >
> >> >> [Vipul] -- If this is the case: one host is working as the server,
> >> >> and on it I bind to IP 127.4.252.1; the other host always acts as
> >> >> the client, and there I just do connect().
> >> >> So the client will bind to all IP addresses. In that case I am not
> >> >> clear why and how the private IP does not appear in the association
> >> >> while the public IP does.
> >> > Can you post the code that you are using to set up this connection?
> >> >
> >> ******
> >> #define SERVER_IP "127.4.252.1"
> >> #define CLIENT_IP "127.4.253.1"
> >>
> >> socklen_t optlen = sizeof (optval);
> >> struct sctp_event_subscribe events ;
> >> ..
> >> ..
> >> start_socket:
> >>
> >>     tSoc = -1 ;
> >>     lSoc = -1 ; /* initialize the socket descriptor */
> >>     if ( (lSoc = socket(AF_INET, SOCK_STREAM, IPPROTO_SCTP)) < 0 ) {
> >>         //error handling.
> >>         return ;
> >>     }
> >> ...
> >> ...
> >>
> >>     /* Specify the peer end point for connection */
> >>     servaddr.sin_family = AF_INET;
> >>     servaddr.sin_port = htons(7169);
> >>     inet_aton(SERVER_IP, &servaddr.sin_addr);
> >>
> >>     optval = 0;
> >>     if (setsockopt(lSoc, SOL_SOCKET, SO_KEEPALIVE, &optval, optlen) < 0) {
> >>         //error handling.
> >>     }
> >>
> >>     /* set to re-use to avoid "Address in use" on retry */
> >>     optval = 1;
> >>     if(setsockopt(lSoc, SOL_SOCKET, SO_REUSEADDR, &optval, optlen) < 0) {
> >>         //error handling.
> >>     }
> >>
> >>
> >> ....
> >> ....
> >>
> >>     /* bind the server address and keep listening */
> >>     if (0 != bind(lSoc, (struct sockaddr *)&servaddr, sizeof(sockaddr))) {
> >>         // error handling.
> >>         close(lSoc);
> >>         sleep(2);
> >>         goto start_socket;
> >>     }
> >>
> >>     if (listen(lSoc,5) == -1) {
> >>         close(lSoc);
> >>         sleep(2);
> >>         goto start_socket ; /* close the current connection and wait for new */
> >>     }
> >> ...
> >> ...
> >>
> >>     /* SCTP event notifications to listen for */
> >>     bzero(&events,sizeof(events));
> >>     events.sctp_data_io_event = 1;
> >>     events.sctp_association_event = 1;
> >>     events.sctp_address_event = 1;
> >>     events.sctp_send_failure_event = 1;
> >>     events.sctp_peer_error_event = 1;
> >>     events.sctp_shutdown_event = 1;
> >>     if(setsockopt(lSoc,IPPROTO_SCTP,SCTP_EVENTS,&events,sizeof(events)) < 0) {
> >>         // Error handling.
> >>     }
> >> ...
> >> ...
> >>
> >>     while (1) {
> >> start:
> >>         if (server) { //server case
> >>             if((tSoc = accept (lSoc, NULL, 0)) == -1 ) {
> >>                 //Error handle
> >>                 sleep(2) ;
> >>                 goto start ;
> >>             }
> >>         } else { // client case
> >>             if ((connect(lSoc, (struct sockaddr *)&servaddr,
> >>                          sizeof(sockaddr))) != 0) {
> >>                 //Error Handling
> >>             }
> >>         }
> >>
> >>     }
> >>
> >> ...
> >> ...
> >> After that, when I do
> >>
> >>     if((len = recvmsg(tSoc, msg, MSG_NOSIGNAL)) > 0) {
> >>         // Some operation.
> >>     } /* if (len > 0) */
> >>
> >>     if (len < 0) {
> >>         if (getsockopt(tSoc, IPPROTO_SCTP,SCTP_STATUS, &status, &socklen) == -1) {
> >>             HERE I GET CONNECTION RESET BY PEER
> >>         }
> >>     }
> >> ************************************
> >>
> >> I have the following default settings related to sctp.
> >>
> >> sh-3.2# sysctl -a |grep sctp
> >> net.sctp.rto_initial = 3000
> >> net.sctp.rto_min = 1000
> >> net.sctp.rto_max = 60000
> >> net.sctp.valid_cookie_life = 600000
> >> net.sctp.max_burst = 4
> >> net.sctp.association_max_retrans = 10
> >> net.sctp.sndbuf_policy = 0
> >> net.sctp.rcvbuf_policy = 0
> >> net.sctp.path_max_retrans = 5
> >> net.sctp.max_init_retransmits = 8
> >> net.sctp.hb_interval = 30000
> >> net.sctp.cookie_preserve_enable = 1
> >> net.sctp.rto_alpha_exp_divisor = 3
> >> net.sctp.rto_beta_exp_divisor = 2
> >> net.sctp.addip_enable = 0
> >> net.sctp.prsctp_enable = 1
> >> net.sctp.sack_timeout = 200
> >> net.sctp.sctp_mem = 36171 48230 72342
> >> net.sctp.sctp_rmem = 4096 397500 1543360
> >> net.sctp.sctp_wmem = 4096 16384 1543360
> >> net.sctp.auth_enable = 0
> >> net.sctp.addip_noauth_enable = 0
> >> net.sctp.addr_scope_policy = 1
> >> net.sctp.rwnd_update_shift = 4
> >> net.sctp.max_autoclose = 8589934
> >> sh-3.2#
> >>
> >> ********
> >> Attaching two pcap files.
> >>
> >> 1. Client_localIP.pcap ---> In this file the local IP (10.1.1.1) is
> >> not in the association.
> >> 2. Client_publicIP.pcap ---> In this file 164.48.1.1 is part of the
> >> association.
> >>
> >>
> >>
> >> >> I have also tried "echo "2" > /proc/sys/net/sctp/addr_scope_policy",
> >> >> which does not even allow the association with the private IP. For
> >> >> both (private and public IP), the server's receiving end gets the
> >> >> connection reset by peer.
> >> >>
> >> > Can you provide a tcpdump of this as well?
> >> > Neil
> >> >
> >> >
> >> >> > Neil
> >> >> >
> >> >>
> >> >>
> >> >> --
> >> >> -=vipsy
> >> >> http://through-dlens.blogspot.in
> >> >>
> >>
> > So, I'm somewhat confused here. I've looked at both the tcpdumps you provided
> > and both traces show that ABORT chunks are generated, the exact same ABORT
> > chunks.
> >
>
> [Vipul] -- Ohh I got your point.
> If you are looking at the ABORTs at the start, the server was not
> running at that time; once it started there is no ABORT in between.
> (Please forgive me, I didn't explain this in my previous mail.) And if
> you look into the capture with the public IP, there are continuing
> ABORTs at some interval. I don't know why this ABORT comes after some
> data transmission.
>
Ah, ok, I still don't see a significant difference though. You have
identical INIT chunks in the client_localIP trace, some of which produce
aborts, some of which result in successful connections. I'm not sure what
the difference is, and neither of them contains your 164.* address in the
init chunk (look at frames 109-112 and 113-116 for comparative examples).
I would suggest looking at the snmp stats on the peer, or just enabling
dynamic debugging in the sctp module to get some clue as to what's going
on here.
>
> > So I'm left wondering why you think one environment works and the other
> > doesn't (unless neither environment works and I just misunderstood you
> > previously).
> >
> > FWIW, the ABORT chunks have no causal data, which suggests they are either the
> > result of the reception of an out of the blue packet, a t5 timer expiration or
> > an init verification failure (I would guess the latter). What does
> > /proc/net/sctp/snmp show after you get the aborts?
> >
> > Neil
> >
> >>
>
> I used bind() before connect() to bind only one IP, and this fixes the
> issue. Now I cannot see multiple IPs in the association and there are
> no ABORT packets.
>
> I'll be able to update with the snmp output after a while, as I need to
> revert the code to get it.
>
> (If you think it is worth discussing why the ABORT is coming, I'd
> really like to continue this.)
>
> PS: Thanks a lot to the whole team for helping and guiding me.
>
Don't thank me yet, I'm going on vacation here for a week or so.
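
For anyone reading this in the archive: the fix described above (an explicit bind() on the client before connect(), so the association is set up with one local address instead of everything the autobind case picks up) can be sketched roughly as below. This is a minimal sketch, not the code from the original program. The helper names make_addr() and sctp_connect_from() are hypothetical, and it assumes a Linux host with SCTP support in the kernel and IPv4 only.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <errno.h>
#include <netinet/in.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: fill a sockaddr_in from a dotted-quad string.
 * Returns 0 on success, -1 if the string is not a valid IPv4 address. */
int make_addr(const char *ip, uint16_t port, struct sockaddr_in *out)
{
    assert(out != NULL);
    memset(out, 0, sizeof(*out));
    out->sin_family = AF_INET;
    out->sin_port = htons(port);
    return inet_pton(AF_INET, ip, &out->sin_addr) == 1 ? 0 : -1;
}

/* Hypothetical helper: open a one-to-one style SCTP socket, bind it to a
 * single local address, then connect to the peer.
 * Returns the connected descriptor, or -1 with errno set by the failing call. */
int sctp_connect_from(const char *local_ip, const char *peer_ip,
                      uint16_t peer_port)
{
    struct sockaddr_in local, peer;
    int fd, saved;

    /* Local port 0 asks the kernel to pick an ephemeral port. */
    if (make_addr(local_ip, 0, &local) < 0 ||
        make_addr(peer_ip, peer_port, &peer) < 0) {
        errno = EINVAL;
        return -1;
    }

    fd = socket(AF_INET, SOCK_STREAM, IPPROTO_SCTP);
    if (fd < 0)
        return -1;

    /* Explicit bind: without it the socket is autobound at connect() time,
     * and with multihoming every available local address may be used. */
    if (bind(fd, (struct sockaddr *)&local, sizeof(local)) < 0 ||
        connect(fd, (struct sockaddr *)&peer, sizeof(peer)) < 0) {
        saved = errno;          /* keep the error from bind()/connect() */
        close(fd);
        errno = saved;
        return -1;
    }
    return fd;
}
```

With the addresses quoted earlier in the thread, a call would look like sctp_connect_from("127.4.253.1", "127.4.252.1", 7169). To offer several local addresses but not all of them, sctp_bindx() from lksctp-tools is probably the better fit than a single bind().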