So it looks like we are quite inefficient: most of the time we catch only 1 completion per interrupt, and the whole point is that we need to find more! This fio run is single-threaded with QD=32, so I'd expect us to be somewhere in the 8-31 range almost all the time... I also tried QD=1024, and the histogram is still the same.
It looks like it takes you longer to submit an I/O than to service an interrupt,
Well, with irq-poll we do practically nothing in the interrupt handler; we only schedule irq-poll. Note that the latency measurements cover only the interval between the point the interrupt arrives and the point we actually service it by polling for completions.
so increasing the queue depth in the single-threaded case doesn't make much difference. You might want to try multiple threads per core with a QD of, say, 32.
This is how I ran, QD=32.
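
To make the submit-rate argument above concrete, here is a rough toy model in C. It is not kernel code and not the measurement discussed in this thread: the function name, the tick-based timing, and all parameters (submit_cost, device_latency, poll_delay) are assumptions made up for illustration. The interrupt does nothing except schedule a poll a few ticks later, and the poll reaps whatever has completed by then.

```c
#include <assert.h>
#include <stdlib.h>

/*
 * Toy model (hypothetical, not kernel code): one submitter, a device with a
 * fixed latency, and an interrupt that only schedules a poll `poll_delay`
 * ticks later. Returns the average number of completions reaped per poll,
 * or -1.0 if memory allocation fails.
 */
double avg_completions_per_poll(long submit_cost, long device_latency,
                                long poll_delay, long queue_depth,
                                long total_ios)
{
    assert(submit_cost >= 1 && device_latency >= 1 && poll_delay >= 1);
    assert(queue_depth >= 1 && total_ios >= 1);

    /* Completion time of each I/O, in submission order. */
    long *done_at = malloc((size_t)total_ios * sizeof(*done_at));
    if (!done_at)
        return -1.0;

    long submitted = 0, reaped = 0, polls = 0;
    long next_submit = 0;
    long poll_at = -1;               /* -1: no poll scheduled */

    for (long t = 0; reaped < total_ios; t++) {
        /* 1. Scheduled poll runs: reap everything the device has finished. */
        if (poll_at == t) {
            while (reaped < submitted && done_at[reaped] <= t)
                reaped++;
            polls++;
            poll_at = -1;
        }
        /* 2. Interrupt: a completion is pending and no poll is scheduled. */
        if (poll_at < 0 && reaped < submitted && done_at[reaped] <= t)
            poll_at = t + poll_delay;
        /* 3. Submitter issues one I/O if it is ready and the queue has room. */
        if (submitted < total_ios && t >= next_submit &&
            submitted - reaped < queue_depth) {
            done_at[submitted++] = t + device_latency;
            next_submit = t + submit_cost;
        }
    }

    free(done_at);
    return (double)total_ios / (double)polls;
}
```

In this model, when submit_cost is larger than poll_delay (for example avg_completions_per_poll(10, 50, 2, 32, 1000)), every poll finds exactly one completion, and raising queue_depth to 1024 changes nothing because the queue never fills. Dropping submit_cost below poll_delay, which is roughly what several submitting threads per core would do, pushes the average well above one.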