On Mon, Jul 16, 2012 at 10:32:50PM -0700, David Miller wrote:
> From: Neil Horman <nhorman@xxxxxxxxxxxxx>
> Date: Mon, 16 Jul 2012 15:13:51 -0400
>
> > A few days ago Dave Jones reported this oops:
> ...
> > It appears from his analysis and some staring at the code that this is likely
> > occurring because an association is getting freed while still on the
> > sctp_assoc_hashtable. As a result, we get a gpf when traversing the hashtable
> > while a freed node corrupts part of the list.
> >
> > Nominally I would think that an imbalanced refcount was responsible for this,
> > but I can't seem to find any obvious imbalance. What I did note however was
> > that the two places where we create an association using
> > sctp_primitive_ASSOCIATE (__sctp_connect and sctp_sendmsg), have failure paths
> > which free a newly created association after calling sctp_primitive_ASSOCIATE.
> > sctp_primitive_ASSOCIATE brings us into the sctp_sf_do_prm_asoc path, which
> > issues a SCTP_CMD_NEW_ASOC side effect, which in turn adds a new association to
> > the aforementioned hash table. The sctp command interpreter that processes side
> > effects has no way to unwind previously processed commands, so freeing the
> > association from the __sctp_connect or sctp_sendmsg error path would lead to a
> > freed association remaining on this hash table.
> >
> > I've fixed this by modifying sctp_[un]hash_established to use hlist_del_init,
> > which allows us to properly use hlist_unhashed to check if the node is on a
> > hashlist safely during a delete. That in turn allows us to safely call
> > sctp_unhash_established in the __sctp_connect and sctp_sendmsg error paths
> > before freeing them, regardless of what the association's state is on the hash
> > list.
> >
> > I noted, while I was doing this, that the __sctp_unhash_endpoint was using
> > hlist_unhashed in a similar fashion, but never nullified any removed node's
> > pointers to make that function work properly, so I fixed that up in a similar
> > fashion.
> >
> > I attempted to test this using a virtual guest running the SCTP_RR test from
> > netperf while running the trinity fuzzer, both in a loop. I wasn't
> > able to recreate the problem prior to this fix, nor was I able to trigger the
> > failure after (neither of which I suppose is surprising). Given the trace above
> > however, I think it's likely that this is what we hit.
> >
> > Signed-off-by: Neil Horman <nhorman@xxxxxxxxxxxxx>
> > Reported-by: davej@xxxxxxxxxx
>
> Looks great, applied and queued up for -stable, thanks Neil.
>
Thanks Dave!
Neil
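
For readers following along, here is a rough userspace sketch of why hlist_del_init matters for the fix described in the quoted patch. It is not the patch itself. The hlist helpers below are simplified re-implementations modeled on include/linux/list.h, and struct demo_assoc, demo_unhash, demo_bucket_len and demo_error_path are hypothetical names made up for illustration. The idea: hlist_del_init resets the node's pointers, so hlist_unhashed is true afterwards and an unhash helper can be called from an error path whether or not the node was ever hashed. Plain hlist_del leaves poison values in the node, so hlist_unhashed cannot be trusted after it.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified userspace model of the kernel's hlist types. */
struct hlist_node { struct hlist_node *next, **pprev; };
struct hlist_head { struct hlist_node *first; };

static void INIT_HLIST_NODE(struct hlist_node *n)
{
	n->next = NULL;
	n->pprev = NULL;
}

/* A node counts as "unhashed" when its pprev pointer is NULL. */
static int hlist_unhashed(const struct hlist_node *n)
{
	return !n->pprev;
}

static void hlist_add_head(struct hlist_node *n, struct hlist_head *h)
{
	struct hlist_node *first = h->first;

	n->next = first;
	if (first)
		first->pprev = &n->next;
	h->first = n;
	n->pprev = &h->first;
}

/* Unlink the node and reset it, so hlist_unhashed() is true afterwards. */
static void hlist_del_init(struct hlist_node *n)
{
	if (hlist_unhashed(n))
		return;
	*n->pprev = n->next;
	if (n->next)
		n->next->pprev = n->pprev;
	INIT_HLIST_NODE(n);
}

/* Hypothetical stand-in for an association; not the real sctp struct. */
struct demo_assoc {
	int id;
	struct hlist_node node;
};

/* Hypothetical unhash helper: safe to call in any hash state. */
static void demo_unhash(struct demo_assoc *a)
{
	hlist_del_init(&a->node);
}

static int demo_bucket_len(const struct hlist_head *h)
{
	int len = 0;
	const struct hlist_node *pos;

	for (pos = h->first; pos; pos = pos->next)
		len++;
	return len;
}

/* Models an error path that unhashes before freeing, unconditionally. */
int demo_error_path(void)
{
	struct hlist_head bucket = { NULL };
	struct demo_assoc other = { 3, { NULL, NULL } };
	struct demo_assoc hashed = { 1, { NULL, NULL } };
	struct demo_assoc never_hashed = { 2, { NULL, NULL } };

	hlist_add_head(&other.node, &bucket);
	hlist_add_head(&hashed.node, &bucket);

	demo_unhash(&hashed);       /* really removes the node */
	demo_unhash(&hashed);       /* second call is a no-op */
	demo_unhash(&never_hashed); /* never hashed: also a no-op */

	assert(hlist_unhashed(&hashed.node));
	return demo_bucket_len(&bucket); /* only "other" remains */
}
```

Calling demo_error_path() from a small main() leaves a single node on the bucket. The point of the sketch is that the caller does not need to track whether the side effect that hashes the node has already run.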