> -----Original Message----- > From: Dexuan Cui > Sent: Thursday, February 5, 2015 4:45 AM > To: Vitaly Kuznetsov; KY Srinivasan > Cc: devel@xxxxxxxxxxxxxxxxxxxxxx; Haiyang Zhang; linux- > kernel@xxxxxxxxxxxxxxx; Jason Wang > Subject: RE: [PATCH 0/4] Drivers: hv: Further protection for the rescind path > > > -----Original Message----- > > From: Vitaly Kuznetsov [mailto:vkuznets@xxxxxxxxxx] > > Sent: Thursday, February 5, 2015 18:10 PM > > To: KY Srinivasan > > Cc: devel@xxxxxxxxxxxxxxxxxxxxxx; Haiyang Zhang; > > linux-kernel@xxxxxxxxxxxxxxx; Dexuan Cui; Jason Wang > > Subject: Re: [PATCH 0/4] Drivers: hv: Further protection for the > > rescind path > > > > KY Srinivasan <kys@xxxxxxxxxxxxx> writes: > > > > >> -----Original Message----- > > >> From: Vitaly Kuznetsov [mailto:vkuznets@xxxxxxxxxx] > > >> Sent: Tuesday, February 3, 2015 9:01 AM > > >> To: KY Srinivasan; devel@xxxxxxxxxxxxxxxxxxxxxx > > >> Cc: Haiyang Zhang; linux-kernel@xxxxxxxxxxxxxxx; Dexuan Cui; Jason > > >> Wang > > >> Subject: [PATCH 0/4] Drivers: hv: Further protection for the > > >> rescind path > > >> > > >> This series is a continuation of the "Drivers: hv: vmbus: serialize > > >> Offer and Rescind offer". I'm trying to address a number of > > >> theoretically possible issues with rescind offer handling. All > > >> these complications come from the fact that a rescind offer results > > >> in vmbus channel being freed and we must ensure nobody still uses > > >> it. Instead of introducing new locks I suggest we switch channels usage > to the get/put workflow. > > >> > > >> The main part of the series is [PATCH 1/4] which introduces the > > >> workflow for vmbus channels, all other patches fix different corner > > >> cases using this workflow. I'm not sure all such cases are covered > > >> with this series (probably not), but in case protection is required > > >> in some other places it should become relatively easy to add one. > > >> > > >> I did some sanity testing with CONFIG_DEBUG_LOCKDEP=y and nothing > > >> popped out, however, additional testing would be much appreciated. > > >> > > >> K.Y., Haiyang, I'm not sending this series to netdev@ and > > >> linux-scsi@ as it is supposed to be applied as a whole, please > > >> resend these patches with your sign-offs when (and if) we're done with > reviews. Thanks! > > > > > > Vitaly, > > > > > > Thanks for looking into this issue. While today, rescind offer > > > results in the > > freeing of the channel, I don't think > > > that is required. By not freeing up the channel in the rescind path, > > > we can have > > a safe way to access the channel and > > > that does not have to involve taking a reference on the channel > > > every time you > > access it - the get/put workflow in your > > > patch set. As part of the network performance improvement work, I > > > had > > eliminated all locks in the receive path by setting > > > up per-cpu data structures for mapping the relid to channel etc. > > > These set of > > patches introduces locking/atomic operations > > > in performance critical code paths to deal with an event that is > > > truly rare - the channel getting rescinded. > > > > It is possible to eliminate all locks/atomic operations from > > performance critical pyth in my patch series by following Dexuan's > > suggestion - we'll get the channel in vmbus_open and put it in > > vmbus_close (and on processing offer/rescind offer) this won't affect > > performance. I'm in the middle of testing this approach. > > > > > > > > All channel messages are handled in a single work context: > > > vmbus_on_msg_dpc() -> vmbus_onmessage_work()-> Various channel > > messages [offer, rescind etc.] > This is true. > > > > > > > So, the rescind message cannot be processed while we are processing > > > the > > offer message and since an offer > > > cannot be rescinded before it is offered, offer and rescind are > > > naturally serialized (I think I have patchset in my queue > IMO this may not be true. > The cause is: > (I'm using the latest linux-next repo, which includes Vitaly's patch > "Drivers: hv: vmbus: serialize Offer and Rescind offer".) > > vmbus_onoffer_rescind() runs in vmbus_connection.work_queue, but > vmbus_process_offer() runs in the per-channel newchannel->controlwq, so > the two functions are not serialized, at least in theory. > > As a result, in vmbus_process_offer(): after the new channel is added into > vmbus_connection.chn_list, but before the channel is completely initialized > by us (we need to create a vmbus device and associate the device with the > channel -- this procedure could fail and we goto err_free_chan and free the > channel directly!), vmbus_onoffer_rescind() can see the new channel, but > doesn't know the channel could be freed in another place at the same time. > > BTW, when vmbus_process_offer() -> vmbus_device_create() fails, we goto > err_free_chan without removing the new channel from > vmbus_connection.chn_list? > > As another result : in vmbus_process_offer(), in the case > vmbus_device_register() fails, we'll run "list_del(&newchannel->listentry) > and unlock vmbus_connection.channel_lock" -- just after these 2 lines, at > this time, > vmbus_onoffer_rescind() -> relid2channel() can return NULL, and we'll miss > the rescind message, i.e., we fail to send the > CHANNELMSG_RELID_RELEASED message to the host. This is the way the code is; but it does not have to be. Here is a patch that implements what I have been trying to describe. This is a rough proposal; there is additional (obvious) cleanup that I have not done. This is on top of the patches that have been submitted to Greg. Signed-off-by: K. Y. Srinivasan <kys@xxxxxxxxxxxxx> --- drivers/hv/channel.c | 8 +++++ drivers/hv/channel_mgmt.c | 63 +++++++++++++------------------------------- drivers/hv/hv_util.c | 2 +- drivers/hv/vmbus_drv.c | 24 ++++++++++++----- include/linux/hyperv.h | 1 + 5 files changed, 46 insertions(+), 52 deletions(-) diff --git a/drivers/hv/channel.c b/drivers/hv/channel.c index bf0cf8f..4a21cd8 100644 --- a/drivers/hv/channel.c +++ b/drivers/hv/channel.c @@ -501,6 +501,14 @@ static int vmbus_close_internal(struct vmbus_channel *channel) put_cpu(); } + /* + * If the channel has been rescinded; process device removal. + */ + if (channel->rescind) { + hv_process_device_removal(channel); + return 0; + } + /* Send a closing message */ msg = &channel->close_msg.msg; diff --git a/drivers/hv/channel_mgmt.c b/drivers/hv/channel_mgmt.c index 0ba6b5c..a208256 100644 --- a/drivers/hv/channel_mgmt.c +++ b/drivers/hv/channel_mgmt.c @@ -207,27 +207,11 @@ static void percpu_channel_deq(void *arg) list_del(&channel->percpu_list); } -/* - * vmbus_process_rescind_offer - - * Rescind the offer by initiating a device removal - */ -static void vmbus_process_rescind_offer(struct work_struct *work) +void hv_process_device_removal(struct vmbus_channel *channel) { - struct vmbus_channel *channel = container_of(work, - struct vmbus_channel, - work); + struct vmbus_channel_relid_released msg; unsigned long flags; struct vmbus_channel *primary_channel; - struct vmbus_channel_relid_released msg; - struct device *dev; - - if (channel->device_obj) { - dev = get_device(&channel->device_obj->device); - if (dev) { - vmbus_device_unregister(channel->device_obj); - put_device(dev); - } - } memset(&msg, 0, sizeof(struct vmbus_channel_relid_released)); msg.child_relid = channel->offermsg.child_relid; @@ -256,6 +240,7 @@ static void vmbus_process_rescind_offer(struct work_struct *work) free_channel(channel); } + void vmbus_free_channels(void) { struct vmbus_channel *channel; @@ -270,11 +255,8 @@ void vmbus_free_channels(void) * vmbus_process_offer - Process the offer by creating a channel/device * associated with this offer */ -static void vmbus_process_offer(struct work_struct *work) +static void vmbus_process_offer(struct vmbus_channel *newchannel) { - struct vmbus_channel *newchannel = container_of(work, - struct vmbus_channel, - work); struct vmbus_channel *channel; bool fnew = true; bool enq = false; @@ -340,7 +322,7 @@ static void vmbus_process_offer(struct work_struct *work) if (channel->sc_creation_callback != NULL) channel->sc_creation_callback(newchannel); - goto done_init_rescind; + goto done; } goto err_free_chan; @@ -381,15 +363,10 @@ static void vmbus_process_offer(struct work_struct *work) kfree(newchannel->device_obj); goto err_free_chan; } -done_init_rescind: - spin_lock_irqsave(&newchannel->lock, flags); - /* The next possible work is rescind handling */ - INIT_WORK(&newchannel->work, vmbus_process_rescind_offer); - /* Check if rescind offer was already received */ - if (newchannel->rescind) - queue_work(newchannel->controlwq, &newchannel->work); - spin_unlock_irqrestore(&newchannel->lock, flags); + +done: return; + err_free_chan: free_channel(newchannel); } @@ -515,8 +492,7 @@ static void vmbus_onoffer(struct vmbus_channel_message_header *hdr) newchannel->monitor_grp = (u8)offer->monitorid / 32; newchannel->monitor_bit = (u8)offer->monitorid % 32; - INIT_WORK(&newchannel->work, vmbus_process_offer); - queue_work(newchannel->controlwq, &newchannel->work); + vmbus_process_offer(newchannel); } /* @@ -529,6 +505,7 @@ static void vmbus_onoffer_rescind(struct vmbus_channel_message_header *hdr) struct vmbus_channel_rescind_offer *rescind; struct vmbus_channel *channel; unsigned long flags; + struct device *dev; rescind = (struct vmbus_channel_rescind_offer *)hdr; channel = relid2channel(rescind->child_relid); @@ -539,18 +516,16 @@ static void vmbus_onoffer_rescind(struct vmbus_channel_message_header *hdr) spin_lock_irqsave(&channel->lock, flags); channel->rescind = true; - /* - * channel->work.func != vmbus_process_rescind_offer means we are still - * processing offer request and the rescind offer processing should be - * postponed. It will be done at the very end of vmbus_process_offer() - * as rescind flag is being checked there. - */ - if (channel->work.func == vmbus_process_rescind_offer) - /* work is initialized for vmbus_process_rescind_offer() from - * vmbus_process_offer() where the channel got created */ - queue_work(channel->controlwq, &channel->work); - spin_unlock_irqrestore(&channel->lock, flags); + + if (channel->device_obj) { + dev = get_device(&channel->device_obj->device); + if (dev) { + vmbus_device_unregister(channel->device_obj); + put_device(dev); + } + } + } /* diff --git a/drivers/hv/hv_util.c b/drivers/hv/hv_util.c index c5be773..7994ec2 100644 --- a/drivers/hv/hv_util.c +++ b/drivers/hv/hv_util.c @@ -380,9 +380,9 @@ static int util_remove(struct hv_device *dev) { struct hv_util_service *srv = hv_get_drvdata(dev); - vmbus_close(dev->channel); if (srv->util_deinit) srv->util_deinit(); + vmbus_close(dev->channel); kfree(srv->recv_buffer); return 0; diff --git a/drivers/hv/vmbus_drv.c b/drivers/hv/vmbus_drv.c index 3cd44ae..64627c4 100644 --- a/drivers/hv/vmbus_drv.c +++ b/drivers/hv/vmbus_drv.c @@ -505,15 +505,25 @@ static int vmbus_probe(struct device *child_device) */ static int vmbus_remove(struct device *child_device) { - struct hv_driver *drv = drv_to_hv_drv(child_device->driver); + struct hv_driver *drv; struct hv_device *dev = device_to_hv_device(child_device); - if (drv->remove) - drv->remove(dev); - else - pr_err("remove not set for driver %s\n", - dev_name(child_device)); - + if (child_device->driver) { + drv = drv_to_hv_drv(child_device->driver); + if (drv->remove) { + drv->remove(dev); + } else { + pr_err("remove not set for driver %s\n", + dev_name(child_device)); + hv_process_device_removal(dev->channel); + } + } else { + /* + * We don't have a driver for this device; deal with the + * rescind message. + */ + hv_process_device_removal(dev->channel); + } return 0; } diff --git a/include/linux/hyperv.h b/include/linux/hyperv.h index b079abf..9658bfe 100644 --- a/include/linux/hyperv.h +++ b/include/linux/hyperv.h @@ -1226,6 +1226,7 @@ void hv_kvp_onchannelcallback(void *); int hv_vss_init(struct hv_util_service *); void hv_vss_deinit(void); void hv_vss_onchannelcallback(void *); +void hv_process_device_removal(struct vmbus_channel *channel); extern struct resource *hyperv_mmio; -- 1.7.4.1 _______________________________________________ devel mailing list devel@xxxxxxxxxxxxxxxxxxxxxx http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel