On Mon, Jul 16, 2012 at 02:39:30PM -0400, Neil Horman wrote: > A few days ago Dave Jones reported this oops: > > [22766.294255] general protection fault: 0000 [#1] PREEMPT SMP > [22766.295376] CPU 0 > [22766.295384] Modules linked in: > [22766.387137] ffffffffa169f292 6b6b6b6b6b6b6b6b ffff880147c03a90 > ffff880147c03a74 > [22766.387135] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 00000000000 > [22766.387136] Process trinity-watchdo (pid: 10896, threadinfo ffff88013e7d2000, > [22766.387137] Stack: > [22766.387140] ffff880147c03a10 > [22766.387140] ffffffffa169f2b6 > [22766.387140] ffff88013ed95728 > [22766.387143] 0000000000000002 > [22766.387143] 0000000000000000 > [22766.387143] ffff880003fad062 > [22766.387144] ffff88013c120000 > [22766.387144] > [22766.387145] Call Trace: > [22766.387145] <IRQ> > [22766.387150] [<ffffffffa169f292>] ? __sctp_lookup_association+0x62/0xd0 > [sctp] > [22766.387154] [<ffffffffa169f2b6>] __sctp_lookup_association+0x86/0xd0 [sctp] > [22766.387157] [<ffffffffa169f597>] sctp_rcv+0x207/0xbb0 [sctp] > [22766.387161] [<ffffffff810d4da8>] ? trace_hardirqs_off_caller+0x28/0xd0 > [22766.387163] [<ffffffff815827e3>] ? nf_hook_slow+0x133/0x210 > [22766.387166] [<ffffffff815902fc>] ? ip_local_deliver_finish+0x4c/0x4c0 > [22766.387168] [<ffffffff8159043d>] ip_local_deliver_finish+0x18d/0x4c0 > [22766.387169] [<ffffffff815902fc>] ? ip_local_deliver_finish+0x4c/0x4c0 > [22766.387171] [<ffffffff81590a07>] ip_local_deliver+0x47/0x80 > [22766.387172] [<ffffffff8158fd80>] ip_rcv_finish+0x150/0x680 > [22766.387174] [<ffffffff81590c54>] ip_rcv+0x214/0x320 > [22766.387176] [<ffffffff81558c07>] __netif_receive_skb+0x7b7/0x910 > [22766.387178] [<ffffffff8155856c>] ? __netif_receive_skb+0x11c/0x910 > [22766.387180] [<ffffffff810d423e>] ? put_lock_stats.isra.25+0xe/0x40 > [22766.387182] [<ffffffff81558f83>] netif_receive_skb+0x23/0x1f0 > [22766.387183] [<ffffffff815596a9>] ? dev_gro_receive+0x139/0x440 > [22766.387185] [<ffffffff81559280>] napi_skb_finish+0x70/0xa0 > [22766.387187] [<ffffffff81559cb5>] napi_gro_receive+0xf5/0x130 > [22766.387218] [<ffffffffa01c4679>] e1000_receive_skb+0x59/0x70 [e1000e] > [22766.387242] [<ffffffffa01c5aab>] e1000_clean_rx_irq+0x28b/0x460 [e1000e] > [22766.387266] [<ffffffffa01c9c18>] e1000e_poll+0x78/0x430 [e1000e] > [22766.387268] [<ffffffff81559fea>] net_rx_action+0x1aa/0x3d0 > [22766.387270] [<ffffffff810a495f>] ? account_system_vtime+0x10f/0x130 > [22766.387273] [<ffffffff810734d0>] __do_softirq+0xe0/0x420 > [22766.387275] [<ffffffff8169826c>] call_softirq+0x1c/0x30 > [22766.387278] [<ffffffff8101db15>] do_softirq+0xd5/0x110 > [22766.387279] [<ffffffff81073bc5>] irq_exit+0xd5/0xe0 > [22766.387281] [<ffffffff81698b03>] do_IRQ+0x63/0xd0 > [22766.387283] [<ffffffff8168ee2f>] common_interrupt+0x6f/0x6f > [22766.387283] <EOI> > [22766.387284] > [22766.387285] [<ffffffff8168eed9>] ? retint_swapgs+0x13/0x1b > [22766.387285] Code: c0 90 5d c3 66 0f 1f 44 00 00 4c 89 c8 5d c3 0f 1f 00 55 48 > 89 e5 48 83 > ec 20 48 89 5d e8 4c 89 65 f0 4c 89 6d f8 66 66 66 66 90 <0f> b7 87 98 00 00 00 > 48 89 fb > 49 89 f5 66 c1 c0 08 66 39 46 02 > [22766.387307] > [22766.387307] RIP > [22766.387311] [<ffffffffa168a2c9>] sctp_assoc_is_match+0x19/0x90 [sctp] > [22766.387311] RSP <ffff880147c039b0> > [22766.387142] ffffffffa16ab120 > [22766.599537] ---[ end trace 3f6dae82e37b17f5 ]--- > [22766.601221] Kernel panic - not syncing: Fatal exception in interrupt > > It appears from his analysis and some staring at the code that this is likely > occuring because an association is getting freed while still on the > sctp_assoc_hashtable. As a result, we get a gpf when traversing the hashtable > while a freed node corrupts part of the list. > > Nominally I would think that an mibalanced refcount was responsible for this, > but I can't seem to find any obvious imbalance. What I did note however was > that the two places where we create an association using > sctp_primitive_ASSOCIATE (__sctp_connect and sctp_sendmsg), have failure paths > which free a newly created association after calling sctp_primitive_ASSOCIATE. > sctp_primitive_ASSOCIATE brings us into the sctp_sf_do_prm_asoc path, which > issues a SCTP_CMD_NEW_ASOC side effect, which in turn adds a new association to > the aforementioned hash table. the sctp command interpreter that process side > effects has not way to unwind previously processed commands, so freeing the > association from the __sctp_connect or sctp_sendmsg error path would lead to a > freed association remaining on this hash table. > > I've fixed this but modifying sctp_[un]hash_established to use hlist_del_init, > which allows us to proerly use hlist_unhashed to check if the node is on a > hashlist safely during a delete. That in turn alows us to safely call > sctp_unhash_established in the __sctp_connect and sctp_sendmsg error paths > before freeing them, regardles of what the associations state is on the hash > list. > > I noted, while I was doing this, that the __sctp_unhash_endpoint was using > hlist_unhsashed in a simmilar fashion, but never nullified any removed nodes > pointers to make that function work properly, so I fixed that up in a simmilar > fashion. > > I attempted to test this using a virtual guest running the SCTP_RR test from > netperf in a loop while running the trinity fuzzer, both in a loop. I wasn't > able to recreate the problem prior to this fix, nor was I able to trigger the > failure after (neither of which I suppose is suprising). Given the trace above > however, I think its likely that this is what we hit. > > Signed-off-by: Neil Horman <nhorman@xxxxxxxxxxxxx> > Reported-by: davej@xxxxxxxxxx > CC: davej@xxxxxxxxxx > CC: "David S. Miller" <davem@xxxxxxxxxxxxx> > CC: Vlad Yasevich <vyasevich@xxxxxxxxx> > CC: Sridhar Samudrala <sri@xxxxxxxxxx> > CC: linux-sctp@xxxxxxxxxxxxxxx > --- > net/sctp/input.c | 7 ++----- > net/sctp/socket.c | 8 +++++++- > 2 files changed, 9 insertions(+), 6 deletions(-) > > diff --git a/net/sctp/input.c b/net/sctp/input.c > index f050d45..05994374 100644 > --- a/net/sctp/input.c > +++ b/net/sctp/input.c > @@ -752,15 +752,12 @@ static void __sctp_unhash_endpoint(struct sctp_endpoint *ep) > > epb = &ep->base; > > - if (hlist_unhashed(&epb->node)) > - return; > - > epb->hashent = sctp_ep_hashfn(epb->bind_addr.port); > > head = &sctp_ep_hashtable[epb->hashent]; > > sctp_write_lock(&head->lock); > - __hlist_del(&epb->node); > + hlist_del_init(&epb->node); > sctp_write_unlock(&head->lock); > } > > @@ -841,7 +838,7 @@ static void __sctp_unhash_established(struct sctp_association *asoc) > head = &sctp_assoc_hashtable[epb->hashent]; > > sctp_write_lock(&head->lock); > - __hlist_del(&epb->node); > + hlist_del_init(&epb->node); > sctp_write_unlock(&head->lock); > } > > diff --git a/net/sctp/socket.c b/net/sctp/socket.c > index b3b8a8d..d740db4 100644 > --- a/net/sctp/socket.c > +++ b/net/sctp/socket.c > @@ -1231,8 +1231,14 @@ out_free: > SCTP_DEBUG_PRINTK("About to exit __sctp_connect() free asoc: %p" > " kaddrs: %p err: %d\n", > asoc, kaddrs, err); > - if (asoc) > + if (asoc) { > + /* sctp_primitive_ASSOCIATE may have added this association > + * To the hash table, try to unhash it, just in case, its a noop > + * if it wasn't hashed so we're safe > + */ > + sctp_unhash_established(asoc); > sctp_association_free(asoc); > + } > return err; > } > > -- > 1.7.7.6 > > Damn, self, nak, I missed comitting the bits for __sctp_connect. I'll repost in a bit. Sorry. Neil -- To unsubscribe from this list: send the line "unsubscribe linux-sctp" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html