>Subject: Re: [PATCH 11/11] scsi: storvsc: Re-init stor_chns when a channel >interrupt is re-assigned > >> >@@ -621,6 +621,63 @@ static inline struct storvsc_device >> >*get_in_stor_device( >> > >> > } >> > >> >+void storvsc_change_target_cpu(struct vmbus_channel *channel, u32 >> >+old, >> >+u32 new) { >> >+ struct storvsc_device *stor_device; >> >+ struct vmbus_channel *cur_chn; >> >+ bool old_is_alloced = false; >> >+ struct hv_device *device; >> >+ unsigned long flags; >> >+ int cpu; >> >+ >> >+ device = channel->primary_channel ? >> >+ channel->primary_channel->device_obj >> >+ : channel->device_obj; >> >+ stor_device = get_out_stor_device(device); >> >+ if (!stor_device) >> >+ return; >> >+ >> >+ /* See storvsc_do_io() -> get_og_chn(). */ >> >+ spin_lock_irqsave(&device->channel->lock, flags); >> >+ >> >+ /* >> >+ * Determines if the storvsc device has other channels assigned to >> >+ * the "old" CPU to update the alloced_cpus mask and the stor_chns >> >+ * array. >> >+ */ >> >+ if (device->channel != channel && device->channel->target_cpu == >> >old) { >> >+ cur_chn = device->channel; >> >+ old_is_alloced = true; >> >+ goto old_is_alloced; >> >+ } >> >+ list_for_each_entry(cur_chn, &device->channel->sc_list, sc_list) { >> >+ if (cur_chn == channel) >> >+ continue; >> >+ if (cur_chn->target_cpu == old) { >> >+ old_is_alloced = true; >> >+ goto old_is_alloced; >> >+ } >> >+ } >> >+ >> >+old_is_alloced: >> >+ if (old_is_alloced) >> >+ WRITE_ONCE(stor_device->stor_chns[old], cur_chn); >> >+ else >> >+ cpumask_clear_cpu(old, &stor_device->alloced_cpus); >> >> If the old cpu is not allocated, is it still necessary to do a cpumask_clear_cpu? > >AFAICT, this really depends on how much we "believe" in the current heuristic >(as implemented by get_og_chn()): ;-) > >The cpumask_clear_cpu() (and the below, dependent "flush" as well) are >intended to re-initialize alloced_cpus and stor_chns in order for get_og_chn() >to re-process/update them. > >Also, notice that (both in the current code and after this series) alloced_cpus >can't be offlined and get_og_chn() does rely on this property (cf., e.g., the >loop/check over alloced_cpus/node_mask). > >I suspect that giving up on this invariant/property would require a certain >amount of re-design in the heuristic/code in question... > > >> >@@ -1360,7 +1432,14 @@ static int storvsc_do_io(struct hv_device >*device, >> > } >> > } >> > } else { >> >+ spin_lock_irqsave(&device->channel->lock, flags); >> >+ outgoing_channel = stor_device->stor_chns[q_num]; >> >+ if (outgoing_channel != NULL) { >> >+ spin_unlock_irqrestore(&device->channel->lock, >> >flags); >> >> Checking outgoing_channel again seems unnecessary. Why not just call >get_og_chn()? > >target_cpu_store() might have changed stor_chns (and alloced_cpus) in the >meantime (but before we've acquired the device's lock): the double check is >to make sure we have a "consistent"/an up-to-date view of stor_chns and >alloced_cpus. > > >> >> >+ goto found_channel; >> >+ } >> > outgoing_channel = get_og_chn(stor_device, q_num); >> >+ spin_unlock_irqrestore(&device->channel->lock, flags); >> > } >> >> With device->channel->lock, now we have one more lock on the I/O issuing >path. It doesn't seem optimal as you are trying to protect the code in >storvsc_change_target_cpu(), this doesn't need to block concurrent I/O >issuers. Maybe moving to RCU is a better approach? > >I don't see this as a problem (*and I've validated such conclusion in >experiments, where the "patched kernel" was sometimes performing slighlty >better than the "unpatched kernel" and sometimes slightly >worse...): > >On the one hand, the stor_chns array "stabilizes" quite early after system >initialization in "normal" (i.e., common) situations (i.e., no channel >reassignments, no device hotplugs...); IOW, get_og_chn() really represents >the "rare and slow" path here (but not that slow! >after all...). Furthermore, notice that even in those "rare cases" >the number of "contending" channels is limited to at most 1 per 4 CPUs IIRC >(alloced_cpus is "sparsely populated"...). Yes I realized it is on the slow path. There is no need to optimize locks. Reviewed-by; Long Li <longli@xxxxxxxxxxxxx> > >The latencies of the RCU grace period (in the order of milliseconds) would be a >major concern for the adoption of RCU here (at least, if we continue to >consider get_og_chn() as an "updater"). I'm afraid that this could be "too >slow" even for our slow path... ;-/ > >What am I missing? ;-) > >Thanks, > Andrea