> From: Long Li
> Sent: Wednesday, January 31, 2018 12:23 PM
> To: Michael Kelley (EOSG) <Michael.H.Kelley@xxxxxxxxxxxxx>; KY Srinivasan
> <kys@xxxxxxxxxxxxx>; Stephen Hemminger <sthemmin@xxxxxxxxxxxxx>;
> martin.petersen@xxxxxxxxxx; devel@xxxxxxxxxxxxxxxxxxxxxx; linux-kernel@xxxxxxxxxxxxxxx;
> linux-scsi@xxxxxxxxxxxxxxx; James E . J . Bottomley <jejb@xxxxxxxxxxxxxxxxxx>
> Subject: RE: [PATCH 1/1] scsi: storvsc: Spread interrupts when picking a channel for I/O
> requests
>
> > Subject: RE: [PATCH 1/1] scsi: storvsc: Spread interrupts when picking a
> > channel for I/O requests
> >
> > Updated/corrected two email addresses ...
> >
> > > -----Original Message-----
> > > From: Michael Kelley (EOSG)
> > > Sent: Wednesday, January 24, 2018 2:14 PM
> > > To: KY Srinivasan <kys@xxxxxxxxxxxxx>; Stephen Hemminger
> > > <sthemmin@xxxxxxxxxxxxx>; martin.petersen@xxxxxxxxxx;
> > > longi@xxxxxxxxxxxxx; JBottomley@xxxxxxxx;
> > > devel@xxxxxxxxxxxxxxxxxxxxxx; linux-kernel@xxxxxxxxxxxxxxx;
> > > linux-scsi@xxxxxxxxxxxxxxx
> > > Cc: Michael Kelley (EOSG) <Michael.H.Kelley@xxxxxxxxxxxxx>
> > > Subject: [PATCH 1/1] scsi: storvsc: Spread interrupts when picking a
> > > channel for I/O requests
> > >
> > > Update the algorithm in storvsc_do_io to look for a channel starting
> > > with the current CPU + 1 and wrap around (within the current NUMA
> > > node). This spreads VMbus interrupts more evenly across CPUs. Previous
> > > code always started with first CPU in the current NUMA node, skewing
> > > the interrupt load to that CPU.
> > >
> > > Signed-off-by: Michael Kelley <mikelley@xxxxxxxxxxxxx>
> > > ---
> > >  drivers/scsi/storvsc_drv.c | 3 ++-
> > >  1 file changed, 2 insertions(+), 1 deletion(-)
> > >
> > > diff --git a/drivers/scsi/storvsc_drv.c b/drivers/scsi/storvsc_drv.c
> > > index e07907d..f3264c4 100644
> > > --- a/drivers/scsi/storvsc_drv.c
> > > +++ b/drivers/scsi/storvsc_drv.c
> > > @@ -1310,7 +1310,8 @@ static int storvsc_do_io(struct hv_device *device,
> > >  			 */
> > >  			cpumask_and(&alloced_mask, &stor_device->alloced_cpus,
> > >  				    cpumask_of_node(cpu_to_node(q_num)));
> > > -			for_each_cpu(tgt_cpu, &alloced_mask) {
> > > +			for_each_cpu_wrap(tgt_cpu, &alloced_mask,
> > > +					outgoing_channel->target_cpu + 1) {
>
> Does it work when target_cpu is the last CPU on the system?
>
> Otherwise, looking good.

Yes, it works. for_each_cpu_wrap() correctly wraps in the case where the
3rd parameter ('start') is one past the end of the mask. Arguably, we
shouldn't rely on that, and should do the wrap to 0 before calling
for_each_cpu_wrap().

>
> > >  				if (tgt_cpu != outgoing_channel->target_cpu) {
> > >  					outgoing_channel =
> > >  					stor_device->stor_chns[tgt_cpu];
> > > --
> > > 1.8.3.1
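
To make the wrap case concrete, here is a rough userspace sketch of the
selection logic with the wrap to 0 done explicitly, as suggested above.
It is only a model: NR_CPUS_MODEL and pick_next_cpu() are made-up names,
the mask is a plain unsigned int rather than a struct cpumask, and none
of this is the kernel's cpumask API or the actual driver code.

```c
/* Hypothetical model: up to 8 CPUs, one bit per CPU in an unsigned int. */
#define NR_CPUS_MODEL 8u

/*
 * Scan 'mask' starting at cur_cpu + 1, wrapping around, and return the
 * first set CPU other than cur_cpu. Returns -1 if no other CPU is set,
 * in which case a caller would keep its current choice.
 */
static int pick_next_cpu(unsigned int mask, unsigned int cur_cpu)
{
	unsigned int start = cur_cpu + 1;
	unsigned int i;

	/* Explicit wrap, so we never start one past the end of the mask. */
	if (start >= NR_CPUS_MODEL)
		start = 0;

	for (i = 0; i < NR_CPUS_MODEL; i++) {
		unsigned int cpu = (start + i) % NR_CPUS_MODEL;

		if (cpu != cur_cpu && (mask & (1u << cpu)))
			return (int)cpu;
	}
	return -1;
}
```

With CPUs 0-3 set in the mask (0x0F), a current CPU of 0 picks 1 and a
current CPU of 3 wraps around to 0. With all 8 CPUs set and the current
CPU being 7 (the last one), start is wrapped to 0 and CPU 0 is picked,
which is the case asked about above.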