On Thu, Mar 26, 2020 at 03:16:21PM +0100, Vitaly Kuznetsov wrote: > "Andrea Parri (Microsoft)" <parri.andrea@xxxxxxxxx> writes: > > > The offer and rescind works are currently scheduled on the so called > > "connect CPU". However, this is not really needed: we can synchronize > > the works by relying on the usage of the offer_in_progress counter and > > of the channel_mutex mutex. This synchronization is already in place. > > So, remove this unnecessary "bind to the connect CPU" constraint and > > update the inline comments accordingly. > > > > Suggested-by: Dexuan Cui <decui@xxxxxxxxxxxxx> > > Signed-off-by: Andrea Parri (Microsoft) <parri.andrea@xxxxxxxxx> > > --- > > drivers/hv/channel_mgmt.c | 21 ++++++++++++++++----- > > drivers/hv/vmbus_drv.c | 39 ++++++++++++++++++++++++++++----------- > > 2 files changed, 44 insertions(+), 16 deletions(-) > > > > diff --git a/drivers/hv/channel_mgmt.c b/drivers/hv/channel_mgmt.c > > index 0370364169c4e..1191f3d76d111 100644 > > --- a/drivers/hv/channel_mgmt.c > > +++ b/drivers/hv/channel_mgmt.c > > @@ -1025,11 +1025,22 @@ static void vmbus_onoffer_rescind(struct vmbus_channel_message_header *hdr) > > * offer comes in first and then the rescind. > > * Since we process these events in work elements, > > * and with preemption, we may end up processing > > - * the events out of order. Given that we handle these > > - * work elements on the same CPU, this is possible only > > - * in the case of preemption. In any case wait here > > - * until the offer processing has moved beyond the > > - * point where the channel is discoverable. > > + * the events out of order. We rely on the synchronization > > + * provided by offer_in_progress and by channel_mutex for > > + * ordering these events: > > + * > > + * { Initially: offer_in_progress = 1 } > > + * > > + * CPU1 CPU2 > > + * > > + * [vmbus_process_offer()] [vmbus_onoffer_rescind()] > > + * > > + * LOCK channel_mutex WAIT_ON offer_in_progress == 0 > > + * DECREMENT offer_in_progress LOCK channel_mutex > > + * INSERT chn_list SEARCH chn_list > > + * UNLOCK channel_mutex UNLOCK channel_mutex > > + * > > + * Forbids: CPU2's SEARCH from *not* seeing CPU1's INSERT > > WAIT_ON offer_in_progress == 0 > LOCK channel_mutex > > seems to be racy: what happens if offer_in_progress increments after we > read it but before we managed to aquire channel_mutex? Remark that the RESCIND work must see the increment which is performed "before" queueing the work in question (and the associated OFFER work), cf. the comment in vmbus_on_msg_dpc() below and dbb92f88648d6 ("workqueue: Document (some) memory-ordering properties of {queue,schedule}_work()") AFAICT, this suffices to meet the intended behavior as sketched above. I might be missing something of course, can you elaborate on the issue here? Thanks, Andrea > > I think this shold be changed to > > LOCK channel_mutex > CHECK offer_in_progress == 0 > EQUAL? GOTO proceed with rescind handling > NOT EQUAL? > WHILE offer_in_progress) != 0 { > UNLOCK channel_mutex > MSLEEP(1) > LOCK channel_mutex > } > proceed with rescind handling: > ... > UNLOCK channel_mutex > > > */ > > > > while (atomic_read(&vmbus_connection.offer_in_progress) != 0) { > > diff --git a/drivers/hv/vmbus_drv.c b/drivers/hv/vmbus_drv.c > > index 7600615e13754..903b1ec6a259e 100644 > > --- a/drivers/hv/vmbus_drv.c > > +++ b/drivers/hv/vmbus_drv.c > > @@ -1048,8 +1048,9 @@ void vmbus_on_msg_dpc(unsigned long data) > > /* > > * The host can generate a rescind message while we > > * may still be handling the original offer. We deal with > > - * this condition by ensuring the processing is done on the > > - * same CPU. > > + * this condition by relying on the synchronization provided > > + * by offer_in_progress and by channel_mutex. See also the > > + * inline comments in vmbus_onoffer_rescind(). > > */ > > switch (hdr->msgtype) { > > case CHANNELMSG_RESCIND_CHANNELOFFER: > > @@ -1071,16 +1072,34 @@ void vmbus_on_msg_dpc(unsigned long data) > > * work queue: the RESCIND handler can not start to > > * run before the OFFER handler finishes. > > */ > > - schedule_work_on(VMBUS_CONNECT_CPU, > > - &ctx->work); > > + schedule_work(&ctx->work); > > break; > > > > case CHANNELMSG_OFFERCHANNEL: > > + /* > > + * The host sends the offer message of a given channel > > + * before sending the rescind message of the same > > + * channel. These messages are sent to the guest's > > + * connect CPU; the guest then starts processing them > > + * in the tasklet handler on this CPU: > > + * > > + * VMBUS_CONNECT_CPU > > + * > > + * [vmbus_on_msg_dpc()] > > + * atomic_inc() // CHANNELMSG_OFFERCHANNEL > > + * queue_work() > > + * ... > > + * [vmbus_on_msg_dpc()] > > + * schedule_work() // CHANNELMSG_RESCIND_CHANNELOFFER > > + * > > + * We rely on the memory-ordering properties of the > > + * queue_work() and schedule_work() primitives, which > > + * guarantee that the atomic increment will be visible > > + * to the CPUs which will execute the offer & rescind > > + * works by the time these works will start execution. > > + */ > > atomic_inc(&vmbus_connection.offer_in_progress); > > - queue_work_on(VMBUS_CONNECT_CPU, > > - vmbus_connection.work_queue, > > - &ctx->work); > > - break; > > + fallthrough; > > > > default: > > queue_work(vmbus_connection.work_queue, &ctx->work); > > @@ -1124,9 +1143,7 @@ static void vmbus_force_channel_rescinded(struct vmbus_channel *channel) > > > > INIT_WORK(&ctx->work, vmbus_onmessage_work); > > > > - queue_work_on(VMBUS_CONNECT_CPU, > > - vmbus_connection.work_queue, > > - &ctx->work); > > + queue_work(vmbus_connection.work_queue, &ctx->work); > > } > > #endif /* CONFIG_PM_SLEEP */ > > -- > Vitaly >