> -----Original Message----- > From: Vitaly Kuznetsov > Sent: Monday, July 13, 2015 20:19 > Subject: [PATCH] Drivers: hv: vmbus: prevent new subchannel creation on device > shutdown > > When a new subchannel offer from host comes during device shutdown (e.g. > when a netvsc/storvsc module is unloadedshortly after it was loaded) a > crash can happen as vmbus_process_offer() is not anyhow serialized with > vmbus_remove(). How about vmbus_onoffer_rescind()? It's not serialized with vmbus_remove() either, so I think there is an issue too? I remember when 'rmmod hv_netvsc', we get a rescind-offer message for each subchannel. > As an example we can try calling subchannel create > callback when the module is already unloaded. > The following crash was observed while keeping loading/unloading hv_netvsc > module on 64-CPU guest: > > hv_netvsc vmbus_14: net device safe to remove > BUG: unable to handle kernel paging request at 000000000000a118 > IP: [<ffffffffa01c1b21>] netvsc_sc_open+0x31/0xb0 [hv_netvsc] > PGD 1f3946a067 PUD 1f38a5f067 PMD 0 > Oops: 0000 [#1] SMP > ... > Call Trace: > [<ffffffffa0055cd7>] vmbus_onoffer+0x477/0x530 [hv_vmbus] > [<ffffffff81092e1f>] ? move_linked_works+0x5f/0x80 > [<ffffffffa0055fd3>] vmbus_onmessage+0x33/0xa0 [hv_vmbus] > [<ffffffffa0052f81>] vmbus_onmessage_work+0x21/0x30 [hv_vmbus] > [<ffffffff81095cde>] process_one_work+0x18e/0x4e0 > ... > > The issue cannot be solved by just resetting sc_creation_callback on > driver removal as while we search for the parent channel with channel_lock > held we release it after the channel was found and it can disapper beneath > our feet while we're still in vmbus_process_offer(); > > Introduce new sc_create_lock mutex and take it in vmbus_remove() to ensure > no new subchannels are created after we started the removal procedure. > Check its state with mutex_trylock in vmbus_process_offer(). In my 8-CPU VM, I can very easily reproduce the panic by 1. running while ((1)); do modprobe -r hv_netvsc; modprobe hv_netvsc; sleep 10; done. and 2. in vmbus_onoffer_rescind(), we sleep 3s after a subchannel is added into the primary channel's sc_list (and before the sc_creation_callback is invoked): (I added line 275) 262 if (!fnew) { 263 /* 264 * Check to see if this is a sub-channel. 265 */ 266 if (newchannel->offermsg.offer.sub_channel_index != 0) { 267 /* 268 * Process the sub-channel. 269 */ 270 newchannel->primary_channel = channel; 271 spin_lock_irqsave(&channel->lock, flags); 272 list_add_tail(&newchannel->sc_list, &channel->sc_list); 273 channel->num_sc++; 274 spin_unlock_irqrestore(&channel->lock, flags); 275 ssleep(3); 276 } else 277 goto err_free_chan; 278 } _______________________________________________ devel mailing list devel@xxxxxxxxxxxxxxxxxxxxxx http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel