Dexuan Cui <decui@xxxxxxxxxxxxx> writes: >> -----Original Message----- >> From: Vitaly Kuznetsov [mailto:vkuznets@xxxxxxxxxx] >> Sent: Tuesday, April 21, 2015 22:28 >> To: KY Srinivasan >> Cc: Haiyang Zhang; devel@xxxxxxxxxxxxxxxxxxxxxx; linux- >> kernel@xxxxxxxxxxxxxxx; Dexuan Cui >> Subject: [PATCH 6/6] Drivers: hv: vmbus: do a fair round robin when >> selecting an outgoing channel >> >> vmbus_get_outgoing_channel() implements the following algorithm for >> selecting >> an outgoing channel (despite the comment before the function saying it >> distributes the load equally): > > Yeah, I also found the issue. > >> 1) If we have no subchannels return the primary channel; >> 2) If primary->next_oc is grater than primary->num_sc reset the primary- >> >next_oc >> to 0 and return the primary channel; >> 3) Aim for the primary->next_oc subchannel, increment primary->next_oc; >> 4) Loop through all opened subchannels. If we see a channel which has >> target_cpu == current_cpu return it. If we reached the primary->next_oc'th >> open subchannel return it; >> 5) Return the primary channel. >> The implementation also skips the subchannel No. 0 unless it matches the >> current >> cpu as we assign i to 1 in the initialization. >> >> This is not a fair round robin as subchannels in the beginning of the list are >> more likely to be returned and checking for current cpu aslo creates > > I suppose the current algorithm is trying to make use of cache locality? > KY may share more information. > >> additional >> complexity. Simplify the vmbus_get_outgoing_channel() function, make it >> do what the comment before it says. > > Hi Vitaly, > It looks your algorithm also has an issue: > Assuming primary->num_sc == 3 (SC1, SC2, SC3) > 1st time: we choose SC1 and primary->next_oc is set to 1. > 2nd time: we choose SC2 and primary->next_oc is set to 2. > 3rd time: we choose SC3 and primary->next_oc is set to 3. > 4th time and later: since i varies among 1~3 and can't be bigger than 3, > we always choose the primary channel. You're right, it seems 'if (primary->next_oc > primary->num_sc)' is off-by-one, it should be >= > > BTW, IMO it's not easy to achieve complete fairness because > vmbus_get_outgoing_channel() can run simultaneously on different > CPUs (so primary->next_oc can be modified at the same time by multiple > CPUs), and we believe it should be lockless. > Maybe atomic_t can help(?) This thing bothered me a bit but then I realized that we don't actually care - doing a mistake once is better than suffering from the slowness of locks/atomics. I'd suggest we keep it this way (with the fix mentioned above). Thanks! > > Thanks, > -- Dexuan > >> >> Signed-off-by: Vitaly Kuznetsov <vkuznets@xxxxxxxxxx> >> --- >> drivers/hv/channel_mgmt.c | 27 +++++++++------------------ >> 1 file changed, 9 insertions(+), 18 deletions(-) >> >> diff --git a/drivers/hv/channel_mgmt.c b/drivers/hv/channel_mgmt.c >> index daa6417..df82442 100644 >> --- a/drivers/hv/channel_mgmt.c >> +++ b/drivers/hv/channel_mgmt.c >> @@ -827,39 +827,30 @@ cleanup: >> struct vmbus_channel *vmbus_get_outgoing_channel(struct >> vmbus_channel *primary) >> { >> struct list_head *cur, *tmp; >> - int cur_cpu; >> struct vmbus_channel *cur_channel; >> - struct vmbus_channel *outgoing_channel = primary; >> - int next_channel; >> - int i = 1; >> + int i = 0; >> >> if (list_empty(&primary->sc_list)) >> - return outgoing_channel; >> + return primary; >> >> - next_channel = primary->next_oc++; >> - >> - if (next_channel > (primary->num_sc)) { >> + if (primary->next_oc > primary->num_sc) { >> primary->next_oc = 0; >> - return outgoing_channel; >> + return primary; >> } >> >> - cur_cpu = hv_context.vp_index[get_cpu()]; >> - put_cpu(); >> list_for_each_safe(cur, tmp, &primary->sc_list) { >> + i++; >> cur_channel = list_entry(cur, struct vmbus_channel, sc_list); >> if (cur_channel->state != CHANNEL_OPENED_STATE) >> continue; >> >> - if (cur_channel->target_vp == cur_cpu) >> - return cur_channel; >> - >> - if (i == next_channel) >> + if (i > primary->next_oc) { >> + primary->next_oc = i; >> return cur_channel; >> - >> - i++; >> + } >> } >> >> - return outgoing_channel; >> + return primary; >> } >> EXPORT_SYMBOL_GPL(vmbus_get_outgoing_channel); >> >> -- >> 1.9.3 -- Vitaly _______________________________________________ devel mailing list devel@xxxxxxxxxxxxxxxxxxxxxx http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel