"Andrea Parri (Microsoft)" <parri.andrea@xxxxxxxxx> writes: > The offer and rescind works are currently scheduled on the so called > "connect CPU". However, this is not really needed: we can synchronize > the works by relying on the usage of the offer_in_progress counter and > of the channel_mutex mutex. This synchronization is already in place. > So, remove this unnecessary "bind to the connect CPU" constraint and > update the inline comments accordingly. > > Suggested-by: Dexuan Cui <decui@xxxxxxxxxxxxx> > Signed-off-by: Andrea Parri (Microsoft) <parri.andrea@xxxxxxxxx> > --- > drivers/hv/channel_mgmt.c | 21 ++++++++++++++++----- > drivers/hv/vmbus_drv.c | 39 ++++++++++++++++++++++++++++----------- > 2 files changed, 44 insertions(+), 16 deletions(-) > > diff --git a/drivers/hv/channel_mgmt.c b/drivers/hv/channel_mgmt.c > index 0370364169c4e..1191f3d76d111 100644 > --- a/drivers/hv/channel_mgmt.c > +++ b/drivers/hv/channel_mgmt.c > @@ -1025,11 +1025,22 @@ static void vmbus_onoffer_rescind(struct vmbus_channel_message_header *hdr) > * offer comes in first and then the rescind. > * Since we process these events in work elements, > * and with preemption, we may end up processing > - * the events out of order. Given that we handle these > - * work elements on the same CPU, this is possible only > - * in the case of preemption. In any case wait here > - * until the offer processing has moved beyond the > - * point where the channel is discoverable. > + * the events out of order. We rely on the synchronization > + * provided by offer_in_progress and by channel_mutex for > + * ordering these events: > + * > + * { Initially: offer_in_progress = 1 } > + * > + * CPU1 CPU2 > + * > + * [vmbus_process_offer()] [vmbus_onoffer_rescind()] > + * > + * LOCK channel_mutex WAIT_ON offer_in_progress == 0 > + * DECREMENT offer_in_progress LOCK channel_mutex > + * INSERT chn_list SEARCH chn_list > + * UNLOCK channel_mutex UNLOCK channel_mutex > + * > + * Forbids: CPU2's SEARCH from *not* seeing CPU1's INSERT WAIT_ON offer_in_progress == 0 LOCK channel_mutex seems to be racy: what happens if offer_in_progress increments after we read it but before we managed to aquire channel_mutex? I think this shold be changed to LOCK channel_mutex CHECK offer_in_progress == 0 EQUAL? GOTO proceed with rescind handling NOT EQUAL? WHILE offer_in_progress) != 0 { UNLOCK channel_mutex MSLEEP(1) LOCK channel_mutex } proceed with rescind handling: ... UNLOCK channel_mutex > */ > > while (atomic_read(&vmbus_connection.offer_in_progress) != 0) { > diff --git a/drivers/hv/vmbus_drv.c b/drivers/hv/vmbus_drv.c > index 7600615e13754..903b1ec6a259e 100644 > --- a/drivers/hv/vmbus_drv.c > +++ b/drivers/hv/vmbus_drv.c > @@ -1048,8 +1048,9 @@ void vmbus_on_msg_dpc(unsigned long data) > /* > * The host can generate a rescind message while we > * may still be handling the original offer. We deal with > - * this condition by ensuring the processing is done on the > - * same CPU. > + * this condition by relying on the synchronization provided > + * by offer_in_progress and by channel_mutex. See also the > + * inline comments in vmbus_onoffer_rescind(). > */ > switch (hdr->msgtype) { > case CHANNELMSG_RESCIND_CHANNELOFFER: > @@ -1071,16 +1072,34 @@ void vmbus_on_msg_dpc(unsigned long data) > * work queue: the RESCIND handler can not start to > * run before the OFFER handler finishes. > */ > - schedule_work_on(VMBUS_CONNECT_CPU, > - &ctx->work); > + schedule_work(&ctx->work); > break; > > case CHANNELMSG_OFFERCHANNEL: > + /* > + * The host sends the offer message of a given channel > + * before sending the rescind message of the same > + * channel. These messages are sent to the guest's > + * connect CPU; the guest then starts processing them > + * in the tasklet handler on this CPU: > + * > + * VMBUS_CONNECT_CPU > + * > + * [vmbus_on_msg_dpc()] > + * atomic_inc() // CHANNELMSG_OFFERCHANNEL > + * queue_work() > + * ... > + * [vmbus_on_msg_dpc()] > + * schedule_work() // CHANNELMSG_RESCIND_CHANNELOFFER > + * > + * We rely on the memory-ordering properties of the > + * queue_work() and schedule_work() primitives, which > + * guarantee that the atomic increment will be visible > + * to the CPUs which will execute the offer & rescind > + * works by the time these works will start execution. > + */ > atomic_inc(&vmbus_connection.offer_in_progress); > - queue_work_on(VMBUS_CONNECT_CPU, > - vmbus_connection.work_queue, > - &ctx->work); > - break; > + fallthrough; > > default: > queue_work(vmbus_connection.work_queue, &ctx->work); > @@ -1124,9 +1143,7 @@ static void vmbus_force_channel_rescinded(struct vmbus_channel *channel) > > INIT_WORK(&ctx->work, vmbus_onmessage_work); > > - queue_work_on(VMBUS_CONNECT_CPU, > - vmbus_connection.work_queue, > - &ctx->work); > + queue_work(vmbus_connection.work_queue, &ctx->work); > } > #endif /* CONFIG_PM_SLEEP */ -- Vitaly