Hi, Tao Ren <rentao.bupt@xxxxxxxxx> writes: >> > diff --git a/drivers/usb/gadget/udc/aspeed-vhub/core.c b/drivers/usb/gadget/udc/aspeed-vhub/core.c >> > index cdf96911e4b1..be7bb64e3594 100644 >> > --- a/drivers/usb/gadget/udc/aspeed-vhub/core.c >> > +++ b/drivers/usb/gadget/udc/aspeed-vhub/core.c >> > @@ -135,13 +135,9 @@ static irqreturn_t ast_vhub_irq(int irq, void *data) >> > >> > /* Handle device interrupts */ >> > if (istat & vhub->port_irq_mask) { >> > - unsigned long bitmap = istat; >> > - int offset = VHUB_IRQ_DEV1_BIT; >> > - int size = VHUB_IRQ_DEV1_BIT + vhub->max_ports; >> > - >> > - for_each_set_bit_from(offset, &bitmap, size) { >> > - i = offset - VHUB_IRQ_DEV1_BIT; >> > - ast_vhub_dev_irq(&vhub->ports[i].dev); >> > + for (i = 0; i < vhub->max_ports; i++) { >> > + if (istat & VHUB_DEV_IRQ(i)) >> > + ast_vhub_dev_irq(&vhub->ports[i].dev); >> >> how have you measured your statement above? for_each_set_bit() does >> exactly what you did. Unless your architecture has an instruction which >> helps finds the next set bit (like cls on ARM), which, then, makes it >> much faster. > > I did some testing and result shows for() loop runs faster than > for_each_set_bit() loop. Please refer to details below (discussion with > Benjamin in the original patch) and kindly let me know your > suggestions. no strong feelings, just surprised you're already worried about 20~40 cycles of cpu time ;-) patch queued for next merge window -- balbi