Re: Association issue.

On Fri, Aug 02, 2013 at 10:50:44PM +0530, Vipul Singhania wrote:
> On Fri, Aug 2, 2013 at 6:30 PM, Neil Horman <nhorman@xxxxxxxxxxxxx> wrote:
> > On Fri, Aug 02, 2013 at 02:24:07PM +0530, Vipul Singhania wrote:
> >> On Thu, Aug 1, 2013 at 5:18 PM, Neil Horman <nhorman@xxxxxxxxxxxxx> wrote:
> >> > On Thu, Aug 01, 2013 at 03:52:50PM +0530, Vipul Singhania wrote:
> >> >> On Wed, Jul 31, 2013 at 6:41 PM, Neil Horman <nhorman@xxxxxxxxxxxxx> wrote:
> >> >> > On Wed, Jul 31, 2013 at 10:33:50AM +0530, Vipul Singhania wrote:
> >> >> >> Thanks for the reply.
> >> >> >>
> >> >> >> There is no firewall in that network. It is just a separate network;
> >> >> >> the hosts are directly connected to each other through an L1
> >> >> >> switch, with no other connection to the outside world.
> >> >> >>
> >> >> >> It was just a test in which I gave a public IP to one of the interfaces on one host.
> >> >> >>
> >> >> >> - With the public IP, the association looks like this:
> >> >> >>
> >> >> >> sh-3.2# cat /proc/net/sctp/assocs
> >> >> >>  ASSOC     SOCK   STY SST ST HBKT ASSOC-ID TX_QUEUE RX_QUEUE UID INODE
> >> >> >> LPORT RPORT LADDRS <-> RADDRS HBINT INS OUTS MAXRT T1X T2X RTXC
> >> >> >> ffff8800089b0000 ffff8800335944c0 2   1   3  37916    3      516
> >> >> >>  0       0 10635 48520  7168  127.3.253.1 127.3.21.1 127.4.253.1
> >> >> >> 127.2.253.1 127.1.221.1 164.48.1.1 127.3.254.1 <-> *127.4.252.1
> >> >> >>  7500   300   300   10    0    0        0
> >> >> >> ffff8800089b2000 ffff880033594000 2   1   3  50717    4      516
> >> >> >>  0       0 10634 60890  7169  127.3.253.1 127.3.21.1 127.4.253.1
> >> >> >> 127.2.253.1 127.1.221.1 164.48.1.1 127.3.254.1 <-> *127.4.252.1
> >> >> >>  7500   300   300   10    0    0        0
> >> >> >>
> >> >> >> -----------------------------------------------------------------------------
> >> >> >> - But if I assign a private IP (10.1.1.1), it looks like this:
> >> >> >>
> >> >> >> sh-3.2# cat /proc/net/sctp/assocs
> >> >> >>  ASSOC     SOCK   STY SST ST HBKT ASSOC-ID TX_QUEUE RX_QUEUE UID INODE
> >> >> >> LPORT RPORT LADDRS <-> RADDRS HBINT INS OUTS MAXRT T1X T2X RTXC
> >> >> >> ffff88003c721800 ffff8800335944c0 2   1   3  22045    2        0
> >> >> >>  0       0  5674 47434  7169  127.3.253.1 127.3.21.1 127.4.253.1
> >> >> >> 127.2.253.1 127.1.221.1 <-> *127.4.252.1         7500   300   300   10
> >> >> >>    0    0        0
> >> >> >> ffff88003c720800 ffff880033594000 2   1   3  36124    1        0
> >> >> >>  0       0  5673 58513  7168  127.3.253.1 127.3.21.1 127.4.253.1
> >> >> >> 127.2.253.1 127.1.221.1 <-> *127.4.252.1         7500   300   300   10
> >> >> >>    0    0        0
> >> >> >>
> >> >> > I don't see any difference between the two environments here.  How exactly are
> >> >> > you 'giving' a private ip here?  Are you attempting an ADDIP operation?
> >> >> >
> >> >>
> >> >> [Vipul] -- The difference I can see is that the public IP 164.48.1.1 is
> >> >> in the association's address list (why and how it ends up in this list
> >> >> I am not able to understand). However, if I assign the IP 10.1.1.1/24 to
> >> >> my eth0, it does not appear in the association.
> >> >> The IP address is assigned using ifconfig.
> >> >>
> >> > Ok, so all you're doing is specifying it on the interface, you're not explicitly
> >> > binding to it in whatever program you have.
> >> >
> >>
> >> [vipul] - Yes, on the client side I am not doing an explicit bind.
> >>
> >> >> >>
> >> >> >> - I may be wrong, but is it possible that when we bind to one IP
> >> >> >> (and multihoming is enabled) the association is built with all
> >> >> >> available interfaces?
> >> >> >>
> >> >> > The opposite in fact.  If you bind to a local address the association on that
> >> >> > socket will be created using only the bound address; if you do not bind to a
> >> >> > local address (the autobind case), and multihoming is enabled, then all
> >> >> > available addresses will be used.
> >> >> >
> >> >> [Vipul] -- In that case: one host works as the server, where I bind
> >> >> to IP 127.4.252.1, and the other host always acts as the client,
> >> >> where I just call connect().
> >> >> So the client will bind to all of its IP addresses. What I am not clear
> >> >> about is why and how the private IP is left out of the association
> >> >> while the public IP is included.
> >> > Can you post the code that you are using to set up this connection?
> >> >
> >> ******
> >> #define SERVER_IP "127.4.252.1"
> >> #define CLIENT_IP "127.4.253.1"
> >>
> >> socklen_t optlen = sizeof (optval);
> >> struct sctp_event_subscribe   events ;
> >> ..
> >> ..
> >>  start_socket:
> >>
> >>  tSoc = -1 ;
> >>  lSoc = -1 ; /* initialize the socket descriptor */
> >>     if (  (lSoc = socket(AF_INET, SOCK_STREAM, IPPROTO_SCTP)) < 0 ) {
> >>          //error handling.
> >>          return ;
> >>     }
> >> ...
> >> ...
> >>
> >>  /* Specify the peer end point for connection */
> >>      servaddr.sin_family = AF_INET;
> >>      servaddr.sin_port = htons(7169);
> >>      inet_aton(SERVER_IP, &servaddr.sin_addr);
> >>
> >>      optval = 0;
> >>      if (setsockopt(lSoc, SOL_SOCKET, SO_KEEPALIVE, &optval, optlen) < 0) {
> >>          //error handling.
> >>      }
> >>
> >>      /* set to re-use to avoid "Address in use" on retry */
> >>      optval = 1;
> >>      if(setsockopt(lSoc, SOL_SOCKET, SO_REUSEADDR, &optval, optlen) < 0) {
> >>          //error handling.
> >>      }
> >>
> >>
> >> ....
> >> ....
> >>
> >>     /* bind the server address and keep listening */
> >>      if (0 != bind(lSoc, (struct sockaddr *)&servaddr, sizeof(servaddr))) {
> >>     // error handling.
> >>              close(lSoc);
> >>              sleep(2);
> >>              goto start_socket;
> >>      }
> >>
> >>     if (listen(lSoc,5) == -1) {
> >>         close(lSoc);
> >>         sleep(2);
> >>         goto start_socket ; /* close the current connection and wait for new */
> >>     }
> >> ...
> >> ...
> >>
> >>      /* SCTP event notifications to subscribe to */
> >>      bzero(&events,sizeof(events));
> >>      events.sctp_data_io_event = 1;
> >>      events.sctp_association_event = 1;
> >>      events.sctp_address_event = 1;
> >>      events.sctp_send_failure_event = 1;
> >>      events.sctp_peer_error_event = 1;
> >>      events.sctp_shutdown_event = 1;
> >>      if(setsockopt(lSoc,IPPROTO_SCTP,SCTP_EVENTS,&events,sizeof(events)) < 0) {
> >>          // Error handling.
> >>      }
> >> ...
> >> ...
> >>
> >>     while (1) {
> >>     start:
> >>     if (server) { //server case
> >>         if((tSoc = accept (lSoc, NULL, 0))  == -1 ) {
> >>         //Error handle
> >>         sleep(2) ;
> >>                 goto start ;
> >>             }
> >>     } else {  // client case
> >>         if ((connect(lSoc, (struct sockaddr *)&servaddr,
> >>                      sizeof(servaddr))) != 0) {
> >>             //Error Handling
> >>                }
> >>         }
> >>
> >>    }
> >>
> >> ...
> >> ...
> >> after that when I do
> >>
> >>    if((len = recvmsg(tSoc, msg, MSG_NOSIGNAL)) > 0) {
> >>     // Some operation.
> >>    } /* if (len > 0) */
> >>
> >>    if (len  < 0) {
> >>     if (getsockopt(tSoc, IPPROTO_SCTP,SCTP_STATUS, &status, &socklen) == -1) {
> >>         HERE I GET CONNECTION RESET BY PEER
> >>     }
> >>    }
> >> ************************************
> >>
> >> I have following default setting related to sctp.
> >>
> >> sh-3.2# sysctl -a |grep sctp
> >> net.sctp.rto_initial = 3000
> >> net.sctp.rto_min = 1000
> >> net.sctp.rto_max = 60000
> >> net.sctp.valid_cookie_life = 600000
> >> net.sctp.max_burst = 4
> >> net.sctp.association_max_retrans = 10
> >> net.sctp.sndbuf_policy = 0
> >> net.sctp.rcvbuf_policy = 0
> >> net.sctp.path_max_retrans = 5
> >> net.sctp.max_init_retransmits = 8
> >> net.sctp.hb_interval = 30000
> >> net.sctp.cookie_preserve_enable = 1
> >> net.sctp.rto_alpha_exp_divisor = 3
> >> net.sctp.rto_beta_exp_divisor = 2
> >> net.sctp.addip_enable = 0
> >> net.sctp.prsctp_enable = 1
> >> net.sctp.sack_timeout = 200
> >> net.sctp.sctp_mem = 36171    48230    72342
> >> net.sctp.sctp_rmem = 4096    397500    1543360
> >> net.sctp.sctp_wmem = 4096    16384    1543360
> >> net.sctp.auth_enable = 0
> >> net.sctp.addip_noauth_enable = 0
> >> net.sctp.addr_scope_policy = 1
> >> net.sctp.rwnd_update_shift = 4
> >> net.sctp.max_autoclose = 8589934
> >> sh-3.2#
> >>
> >> ********
> >> Attaching two pcap files.
> >>
> >> 1. Client_localIP.pcap ---> In this file the local IP (10.1.1.1) is not
> >> in the association.
> >> 2. Client_publicIP.pcap ---> In this file 164.48.1.1 is part of the association.
> >>
> >>
> >>
> >> >> I have also tried "echo "2" >
> >> >> /proc/sys/net/sctp/addr_scope_policy", which does not even allow the
> >> >> association with the private IP. In both cases (private and public IP)
> >> >> the server's receiving end gets "connection reset by peer".
> >> >>
> >> > Can you provide a tcpdump of this as well?
> >> > Neil
> >> >
> >> >
> >> >> > Neil
> >> >> >
> >> >>
> >> >>
> >> >> --
> >> >> -=vipsy
> >> >> http://through-dlens.blogspot.in
> >> >>
> >>
> > So, I'm somewhat confused here.  I've looked at both the tcpdumps you provided
> > and both traces show that ABORT chunks are generated, the exact same ABORT
> > chunks.
> >
> 
> [Vipul] -- Ohh, I got your point. If you are looking at the ABORTs at
> the start, the server was not running at that time; once it started
> there are no ABORTs in between. (Please forgive me, I didn't explain
> this in my previous mail.) But if you look at the capture with the
> public IP, ABORTs keep appearing at intervals. I don't know why these
> ABORTs occur after some data transmission.
> 
Ah, ok, I still don't see a significant difference though.  You have
identical INIT chunks in the client_localIP trace, some of which produce aborts,
some of which result in successful connections. I'm not sure what the difference
is, and neither of them contain your 164.* address in the init chunk (look at
frames 109-112 and 113-116 for comparative examples).  I would suggest looking at
the snmp stats on the peer, or just enable dynamic debugging in the sctp module
to get some clue as to what's going on here.
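
For the snmp stats, a rough sketch of pulling individual counters out of
/proc/net/sctp/snmp could look like the following.  It assumes the usual layout
of one "Name value" pair per line.  The function names are made up for
illustration; SctpAborteds and SctpOutOfBlues are the counters that seem most
relevant to the ABORT cases discussed here.

```c
#include <stdio.h>
#include <string.h>

/* Look up one counter by name in a stream laid out like
 * /proc/net/sctp/snmp: one "Name <whitespace> value" pair per line.
 * Returns 0 and stores the value on success, -1 if the name is absent.
 * (Hypothetical helper, not part of any library.) */
int sctp_snmp_lookup(FILE *fp, const char *name, long *value)
{
    char key[64];
    long val;

    rewind(fp);                       /* always scan from the first line */
    while (fscanf(fp, "%63s %ld", key, &val) == 2) {
        if (strcmp(key, name) == 0) {
            *value = val;
            return 0;
        }
    }
    return -1;
}

/* Convenience wrapper: open the given file, look up one counter, close.
 * Pass "/proc/net/sctp/snmp" as path on a host with the sctp module loaded. */
int sctp_snmp_read(const char *path, const char *name, long *value)
{
    FILE *fp = fopen(path, "r");
    int rc;

    if (fp == NULL)
        return -1;
    rc = sctp_snmp_lookup(fp, name, value);
    fclose(fp);
    return rc;
}
```

Calling sctp_snmp_read("/proc/net/sctp/snmp", "SctpAborteds", &n) before and
after a failed connection attempt, and comparing the two values, would show
whether the peer counted the abort.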


> 
> > So I'm left wondering why you think one environment works and the other
> > doesn't (unless neither environment works and I just misunderstood you
> > previously).
> >
> > FWIW, the ABORT chunks have no causal data, which suggests they are either the
> > result of the reception of an out of the blue packet, a t5 timer expiration or
> > an init verification failure (I would guess the latter).  What does
> > /proc/net/sctp/snmp show after you get the aborts?
> >
> > Neil
> >
> >>
> 
> I used bind() before connect() to bind to only one IP, and this fixes
> the issue. Now I no longer see multiple IPs in the association, and
> there are no ABORT packets.
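
For the archive, the bind-before-connect approach described above can be
sketched roughly as below.  This is not the poster's actual code, only a
minimal illustration under stated assumptions: IPv4 only, a one-to-one style
socket as in the earlier snippet, and helper names (make_ipv4_addr,
sctp_connect_from) invented for the example.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Fill an IPv4 socket address.
 * Returns 0 on success, -1 if ip is not a valid dotted-quad string. */
int make_ipv4_addr(const char *ip, uint16_t port, struct sockaddr_in *sa)
{
    memset(sa, 0, sizeof(*sa));
    sa->sin_family = AF_INET;
    sa->sin_port = htons(port);
    return inet_pton(AF_INET, ip, &sa->sin_addr) == 1 ? 0 : -1;
}

/* Open a one-to-one SCTP socket, bind it to local_ip only, then connect
 * to the peer.  Returns the connected descriptor, or -1 on any failure. */
int sctp_connect_from(const char *local_ip, const char *peer_ip,
                      uint16_t peer_port)
{
    struct sockaddr_in local, peer;
    int fd;

    /* Port 0 on the local side lets the kernel pick an ephemeral port. */
    if (make_ipv4_addr(local_ip, 0, &local) < 0 ||
        make_ipv4_addr(peer_ip, peer_port, &peer) < 0)
        return -1;

    fd = socket(AF_INET, SOCK_STREAM, IPPROTO_SCTP);
    if (fd < 0)
        return -1;

    /* The explicit bind replaces the autobind case, so only local_ip
     * should be used as a local address for the association. */
    if (bind(fd, (struct sockaddr *)&local, sizeof(local)) < 0 ||
        connect(fd, (struct sockaddr *)&peer, sizeof(peer)) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}
```

With the addresses from this thread, the client side would call something like
sctp_connect_from("127.4.253.1", "127.4.252.1", 7169) in place of the bare
connect().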
> 
> I'll be able to post the snmp output after a while, as I need to
> revert the code to reproduce this.
> 
> (If you think it is worth discussing why the ABORTs occur, I would
> really like to continue this.)
> 
> PS: Many thanks to the whole team for helping and guiding me.
> 
Don't thank me yet, I'm going on vacation here for a week or so.
> 
> >>
> >> --
> >> -=vipsy
> >> http://through-dlens.blogspot.in
> >
> >
> 
> 
> 
> -- 
> -=vipsy
> http://through-dlens.blogspot.in
> 
--
To unsubscribe from this list: send the line "unsubscribe linux-sctp" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html



