On Mon, Mar 11, 2013 at 7:10 PM, Vlad Yasevich <vyasevich@xxxxxxxxx> wrote: > On 03/11/2013 06:44 PM, Karl Heiss wrote: >> >> On Mon, Mar 11, 2013 at 5:59 PM, Vlad Yasevich <vyasevich@xxxxxxxxx> >> wrote: >>> >>> On 03/09/2013 03:19 PM, Karl Heiss wrote: >>>> >>>> >>>> On Fri, Mar 8, 2013 at 12:06 PM, Karl Heiss <kheiss@xxxxxxxxx> wrote: >>>>> >>>>> >>>>> On Fri, Mar 8, 2013 at 11:42 AM, Vlad Yasevich <vyasevich@xxxxxxxxx> >>>>> wrote: >>>>>> >>>>>> >>>>>> On 03/08/2013 10:37 AM, Karl Heiss wrote: >>>>>>> >>>>>>> >>>>>>> >>>>>>> On Fri, Mar 8, 2013 at 10:31 AM, Vlad Yasevich <vyasevich@xxxxxxxxx> >>>>>>> wrote: >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> On 03/08/2013 09:31 AM, Karl Heiss wrote: >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> On Fri, Mar 8, 2013 at 8:52 AM, Karl Heiss <kheiss@xxxxxxxxx> >>>>>>>>> wrote: >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> On Thu, Mar 7, 2013 at 6:09 PM, Vlad Yasevich >>>>>>>>>> <vyasevich@xxxxxxxxx> >>>>>>>>>> wrote: >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> On 03/07/2013 04:51 PM, Karl Heiss wrote: >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> On Thu, Mar 7, 2013 at 12:17 PM, Vlad Yasevich >>>>>>>>>>>> <vyasevich@xxxxxxxxx> >>>>>>>>>>>> wrote: >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> On 03/07/2013 12:06 PM, Karl Heiss wrote: >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> The issue appears to manifest itself when the connection is >>>>>>>>>>>>>> closed >>>>>>>>>>>>>> from the remote end and getsockopt(SCTP_STATUS) is called >>>>>>>>>>>>>> within >>>>>>>>>>>>>> a >>>>>>>>>>>>>> small window in which the association is still valid but >>>>>>>>>>>>>> asoc->peer.primary_path is NULL. >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> Aha! Thanks. There was a bug in the rcu clean-up that allowed >>>>>>>>>>>>> the >>>>>>>>>>>>> association to remain while all transports have been removed. >>>>>>>>>>>>> >>>>>>>>>>>>> Here is a patch that should have addressed this condition: >>>>>>>>>>>>> >>>>>>>>>>>>> commit 8c98653f05534acd1cb07ea4929702a3659177d1 >>>>>>>>>>>>> Author: Daniel Borkmann <dborkman@xxxxxxxxxx> >>>>>>>>>>>>> Date: Fri Feb 1 04:37:43 2013 +0000 >>>>>>>>>>>>> >>>>>>>>>>>>> sctp: sctp_close: fix release of bindings for deferred >>>>>>>>>>>>> call_rcu's >>>>>>>>>>>>> >>>>>>>>>>>>> Full patch is here: >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=8c98653f05534acd1cb07ea4929702a3659177d1 >>>>>>>>>>>>> >>>>>>>>>>>>> Make sure that you have this patch in the kernel you are >>>>>>>>>>>>> running >>>>>>>>>>>>> >>>>>>>>>>>>> -vlad >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> Unfortunately this patch wont apply to the version of the SCTP >>>>>>>>>>>> stack >>>>>>>>>>>> that we are using (2.6.36.2) since it does not have a >>>>>>>>>>>> sctp_transport_destroy_rcu() function. Is there any chance that >>>>>>>>>>>> simply swapping the order of the instructions without moving >>>>>>>>>>>> them >>>>>>>>>>>> would have any effect? I ask this hypothetically because the >>>>>>>>>>>> race >>>>>>>>>>>> condition window seems to be difficult to recreate, thus nothing >>>>>>>>>>>> to >>>>>>>>>>>> test against (aside from in the field!). >>>>>>>>>>>> >>>>>>>>>>>> Karl >>>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> Hi Karl >>>>>>>>>>> >>>>>>>>>>> I think I see the problem now. The problem happens when the >>>>>>>>>>> association >>>>>>>>>>> is >>>>>>>>>>> destroyed. We delay removing the association from >>>>>>>>>>> the association id pool until all references on the association >>>>>>>>>>> have dropped. As a result, it is possible (for a very short >>>>>>>>>>> period of time) for an association structure to still exist in >>>>>>>>>>> the kernel and still be found via the association id, but that >>>>>>>>>>> association >>>>>>>>>>> has no transports and is about to be completely destroyed. >>>>>>>>>>> >>>>>>>>>>> This is a really interesting race and I need to figure out if it >>>>>>>>>>> is >>>>>>>>>>> there on purpose or not? >>>>>>>>>>> >>>>>>>>>>> In the mean time, here is a patch that should solve it for you. >>>>>>>>>>> >>>>>>>>>>> diff --git a/net/sctp/socket.c b/net/sctp/socket.c >>>>>>>>>>> index b907073..2d92c89 100644 >>>>>>>>>>> --- a/net/sctp/socket.c >>>>>>>>>>> +++ b/net/sctp/socket.c >>>>>>>>>>> @@ -223,7 +223,7 @@ struct sctp_association *sctp_id2assoc(struct >>>>>>>>>>> sock >>>>>>>>>>> *sk, >>>>>>>>>>> sctp_assoc_t id) >>>>>>>>>>> if (!list_empty(&sctp_sk(sk)->ep->asocs)) >>>>>>>>>>> asoc = >>>>>>>>>>> list_entry(sctp_sk(sk)->ep->asocs.next, >>>>>>>>>>> struct >>>>>>>>>>> sctp_association, >>>>>>>>>>> asocs); >>>>>>>>>>> - return asoc; >>>>>>>>>>> + goto done; >>>>>>>>>>> } >>>>>>>>>>> >>>>>>>>>>> /* Otherwise this is a UDP-style socket. */ >>>>>>>>>>> @@ -234,6 +234,7 @@ struct sctp_association *sctp_id2assoc(struct >>>>>>>>>>> sock >>>>>>>>>>> *sk, >>>>>>>>>>> sctp_assoc_t id) >>>>>>>>>>> asoc = (struct sctp_association >>>>>>>>>>> *)idr_find(&sctp_assocs_id, >>>>>>>>>>> (int)id); >>>>>>>>>>> spin_unlock_bh(&sctp_assocs_id_lock); >>>>>>>>>>> >>>>>>>>>>> +done: >>>>>>>>>>> if (!asoc || (asoc->base.sk != sk) || >>>>>>>>>>> asoc->base.dead) >>>>>>>>>>> return NULL; >>>>>>>>>>> >>>>>>>>>> >>>>>>>>>> Vlad, >>>>>>>>>> >>>>>>>>>> Looking at the kdump from the panic, I am seeing that your patch >>>>>>>>>> above >>>>>>>>>> may not work in this case since the asoc is valid, the base.sk is >>>>>>>>>> valid, and base.dead is 0. Unless base.sk is valid but doesn't >>>>>>>>>> match >>>>>>>>>> sk, this wouldn't appear to fix this issue. >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> Hm.. If the association is not marked "dead", it should still have >>>>>>>> all >>>>>>>> its >>>>>>>> transports present. If you look at the peer.transport_addr_list in >>>>>>>> you kdump, is that list empty or not? >>>>>>>> >>>>>>>> Are any other peer transport pointers set (active_path, >>>>>>>> retran_path)? >>>>>>>> >>>>>>> >>>>>>> crash> p ((struct sctp_association *) 0xffff8107670e3000).peer >>>>>>> $14 = { >>>>>>> rwnd = 65535, >>>>>>> transport_addr_list = { >>>>>>> next = 0xffff8107670e3180, >>>>>>> prev = 0xffff8107670e3180 >>>>>>> }, >>>>>>> transport_count = 0, >>>>>>> port = 3868, >>>>>>> primary_path = 0x0, >>>>>>> primary_addr = { >>>>>>> v4 = { >>>>>>> sin_family = 0, >>>>>>> sin_port = 0, >>>>>>> sin_addr = { >>>>>>> s_addr = 0 >>>>>>> }, >>>>>>> __pad = "\000\000\000\000\000\000\000" >>>>>>> }, >>>>>>> v6 = { >>>>>>> sin6_family = 0, >>>>>>> sin6_port = 0, >>>>>>> sin6_flowinfo = 0, >>>>>>> sin6_addr = { >>>>>>> in6_u = { >>>>>>> u6_addr8 = >>>>>>> "\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000", >>>>>>> u6_addr16 = {0, 0, 0, 0, 0, 0, 0, 0}, >>>>>>> u6_addr32 = {0, 0, 0, 0} >>>>>>> } >>>>>>> }, >>>>>>> sin6_scope_id = 0 >>>>>>> }, >>>>>>> sa = { >>>>>>> sa_family = 0, >>>>>>> sa_data = >>>>>>> "\000\000\000\000\000\000\000\000\000\000\000\000\000" >>>>>>> } >>>>>>> }, >>>>>>> active_path = 0x0, >>>>>>> retran_path = 0x0, >>>>>>> last_sent_to = 0x0, >>>>>>> last_data_from = 0x0, >>>>>>> tsn_map = { >>>>>>> tsn_map = 0x0, >>>>>>> base_tsn = 0, >>>>>>> cumulative_tsn_ack_point = 0, >>>>>>> max_tsn_seen = 0, >>>>>>> len = 0, >>>>>>> pending_data = 0, >>>>>>> num_dup_tsns = 0, >>>>>>> dup_tsns = {0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0} >>>>>>> }, >>>>>>> sack_needed = 1 '\001', >>>>>>> sack_cnt = 0, >>>>>>> ecn_capable = 0 '\0', >>>>>>> ipv4_address = 1 '\001', >>>>>>> ipv6_address = 0 '\0', >>>>>>> hostname_address = 0 '\0', >>>>>>> asconf_capable = 0 '\0', >>>>>>> prsctp_capable = 0 '\0', >>>>>>> auth_capable = 0 '\0', >>>>>>> adaptation_ind = 0, >>>>>>> addip_disabled_mask = 0, >>>>>>> i = { >>>>>>> init_tag = 0, >>>>>>> a_rwnd = 0, >>>>>>> num_outbound_streams = 0, >>>>>>> num_inbound_streams = 0, >>>>>>> initial_tsn = 0 >>>>>>> }, >>>>>>> cookie_len = 0, >>>>>>> cookie = 0x0, >>>>>>> addip_serial = 0, >>>>>>> peer_random = 0x0, >>>>>>> peer_chunks = 0x0, >>>>>>> peer_hmacs = 0x0 >>>>>>> } >>>>>>> >>>>>>>> >>>>>>>>>> >>>>>>>>>> Karl >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> Vlad, >>>>>>>>> >>>>>>>>> One other thing, with the difficulty we are having recreating this >>>>>>>>> issue, is there any generic way to increase the likelihood for the >>>>>>>>> transport to be cleared out while delaying the association cleanup? >>>>>>>>> Is there any way that the association is initialized without any >>>>>>>>> transport information? >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> When the association is initialized, the lists are empty, but the >>>>>>>> next >>>>>>>> thing that happens is that we add transport of the destination we >>>>>>>> are >>>>>>>> sending to or receiving from to the association and mark it as >>>>>>>> primary >>>>>>>> and >>>>>>>> active. All this happens under a socket lock, so getsockopt can't >>>>>>>> access the association until all actions on that association >>>>>>>> complete. >>>>>>>> >>>>>>>> >>>>>>>>> The reason I ask; we believe the issue is >>>>>>>>> happening very shortly after the association is brought up (we >>>>>>>>> bring >>>>>>>>> it up and then do the getsockopt()). >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> Can you check what the association state is? Alternately, can you >>>>>>>> provide >>>>>>>> the kdump and the kernel so I can dig around. >>>>>>> >>>>>>> >>>>>>> >>>>>>> >>>>>>> crash> p ((struct sctp_association *) 0xffff8107670e3000).state >>>>>>> $15 = SCTP_STATE_CLOSED >>>>>> >>>>>> >>>>>> >>>>>> >>>>>> >>>>>> Hi Karl >>>>>> >>>>>> Was this the client or the server side? Also what was the socket type >>>>>> (STREAM or SEQPACKET)? >>>>>> >>>>>> -vlad >>>>>>> >>>>>>> >>>>>>> >>>>>>> >>>>>>>> >>>>>>>> Thanks >>>>>>>> -vlad >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> Thanks, >>>>>>>>> Karl >>>>>>>>> >>>>>>>> >>>>>> >>>>> >>>>> We believe this is occurring on the client side (still working on >>>>> confirming, this system is a Diameter router so we get connections >>>>> going in both directions). The connections are all STREAM. We are >>>>> also seeing ABORTs fairly regularly on the connections in suspect. >>>>> >>>>> Karl >>>> >>>> >>>> >>>> So we finally got a capture around the time of the panic. The >>>> panicing system is acting as a server and the client is connecting, >>>> gets through INIT and COOKIE_ECHO, and sends several data packets when >>>> the client sends another INIT. At this point, the server handles the >>>> INIT, starts over and it starts sending data packets again when the >>>> server sends an ABORT because the application doesn't support >>>> restarting the connection. >>> >>> >>> >>> Do you know if this is done through SO_LINGER or with sendmsg and >>> MSG_ABORT? >>> >>> -vlad >>> >>> >>>> It is around this time that the panic >>>> occurs. One thing that I noticed is that the sctp_association >>>> structure looks awfully similar to a temporary association that is >>>> created when an unexpected INIT is received, but before it is >>>> populated with peer information. However the temp value is not set to >>>> 0 as would be expected. >>>> >>>> Karl >>>> >>> >> >> This is done with SO_LINGER. However, we just were able to reproduce the >> issue. >> >> When a duplicate cookie-echo message is received, and the >> sctp_sf_do_5_2_4_dupcook() => sctp_unpack_cookie() is called, it calls >> sctp_association_new() instead of sctp_make_temp_asoc(), and ends up >> creating a full-fledged association instead of one with "temp" set. >> Now, if we enter collision case, the primary path does not get written >> in the association. When the next command is set to SCTP_CMD_NEW_ASOC, >> since the association does not have "temp" marked, it gets added to >> the association hash table and the endpoint. Even when the command >> SCTP_CMD_DELETE_TCB is processed, since the association is not >> temporary, the following check in sctp_cmd_delete_tcb() prevents the >> association from being deleted from the hash table or the endpoint. >> >> if (sctp_style(sk, TCP) && sctp_sstate(sk, LISTENING) && >> (!asoc->temp) && (sk->sk_shutdown != SHUTDOWN_MASK)) >> return; >> >> sctp_unhash_established(asoc); >> <<< never reached >> sctp_association_free(asoc); >> <<< never reached >> >> When we duplicate the traffic using netem, we are able to get this to >> occur when getsockopt(SCTP_STATUS) is called due to the transport >> being NULL. >> >> Karl > > > Hi Karl > > Yep, this is the code I've been looking at as well, just didn't get far > enough. I was focusing the dookcook_a case(). > > I'm attaching a patch (untested) that should fix this. > > -vlad >> >> > Vlad, That looks promising, however SCTP_CMD_SET_ASOC doesn't exist in this (2.6.36.2) SCTP stack. I will look into backporting this side effect state or finding an alternate way of preventing the association from being added to the endpoint. Karl -- To unsubscribe from this list: send the line "unsubscribe linux-sctp" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html