On 6/23/13 3:09 AM, Ingo Molnar wrote:
> If an IO driver is implemented properly then it will batch up requests for the controller, and gets IRQ-notified on a (sub-)batch of buffers completed.
>
> If there's any spinning done then it should be NAPI-alike polling: a single "is stuff completed" polling pass per new block of work submitted, to opportunistically interleave completion with submission work. I don't see where active spinning would improve performance compared to a NAPI-alike technique.
>
> Your numbers obviously show a speedup we'd like to have; I'm just wondering whether the same speedup (or even more) could be implemented via:
>
>  - smart batching that rate-limits completion IRQs in essence
>    + NAPI-alike polling
>
> ... which would almost never result in IRQ-driven completion when we are close to CPU-bound and while not yet saturating the IO controller's capacity.
>
> The spinning approach you add has the disadvantage of actively wasting CPU time, which could be used to run other tasks. In general it's much better to make sure the completion IRQs are rate-limited and to just schedule.
>
> This (combined with a metric ton of fine details) is what the networking code does in essence, and they have no trouble reaching very high throughput.
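To make the suggested pattern concrete, here is a rough sketch in plain C of the submit-then-poll-once idea described above. Every type and helper name in it (hw_submit_batch(), hw_reap_completions(), etc.) is invented for illustration and does not correspond to any real block-layer, SCSI, or driver API; treat it as a sketch of the technique, not an implementation proposal.

/*
 * Hypothetical sketch of NAPI-alike completion handling in a driver's
 * submission path. All structs and helpers are made up for illustration.
 */
#include <stdbool.h>
#include <stddef.h>

struct hw_queue;                /* controller submission/completion queue  */
struct request_batch;           /* a block of requests ready to be issued  */

/* Hypothetical low-level hooks; hardware/driver specific in real life. */
bool hw_submit_batch(struct hw_queue *q, struct request_batch *b);
size_t hw_reap_completions(struct hw_queue *q);   /* non-blocking, one pass */
void hw_enable_completion_irq(struct hw_queue *q);
void hw_disable_completion_irq(struct hw_queue *q);
struct request_batch *next_ready_batch(void);     /* NULL when caller is idle */

/*
 * As long as there is new work to push, completions are reaped with a
 * single opportunistic polling pass per submitted batch, so no IRQ is
 * taken while the submitter is busy. Only when the submitter runs out
 * of work does it re-arm the (rate-limited) completion IRQ and let the
 * normal interrupt-driven path finish the remaining in-flight I/O.
 */
void submit_and_poll(struct hw_queue *q)
{
        struct request_batch *batch;

        hw_disable_completion_irq(q);      /* polling mode while busy     */

        while ((batch = next_ready_batch()) != NULL) {
                hw_submit_batch(q, batch);
                hw_reap_completions(q);    /* one pass, never a spin loop */
        }

        hw_enable_completion_irq(q);       /* idle: back to IRQ-driven    */
        hw_reap_completions(q);            /* close the race with the IRQ */
}

The point of the sketch is that the polling is bounded to one pass per unit of submission work, so it never actively burns CPU time that other tasks could use.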
The networking code has a similar proposal for low-latency sockets that uses polling: https://lwn.net/Articles/540281/
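For reference, here is a minimal user-space sketch of how an application opts into that per-socket busy polling in the form it was eventually merged upstream (the SO_BUSY_POLL socket option, value in microseconds). The option name and the exact knobs post-date the patch set linked above, and the example assumes a kernel built with busy-poll support and a NIC driver that implements the polling hook.

/*
 * Sketch: enable bounded busy polling on one socket. Assumes a kernel
 * with busy-poll support; SO_BUSY_POLL may be missing from older libc
 * headers, hence the fallback define.
 */
#include <stdio.h>
#include <sys/socket.h>

#ifndef SO_BUSY_POLL
#define SO_BUSY_POLL 46
#endif

int enable_busy_poll(int sockfd, int usecs)
{
        /*
         * Ask the kernel to busy-poll the device queue for at most
         * 'usecs' microseconds on blocking receives before falling back
         * to the usual IRQ/NAPI-driven path -- i.e. spin only for a
         * bounded, opportunistic window, not unconditionally.
         */
        if (setsockopt(sockfd, SOL_SOCKET, SO_BUSY_POLL,
                       &usecs, sizeof(usecs)) < 0) {
                perror("setsockopt(SO_BUSY_POLL)");
                return -1;
        }
        return 0;
}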
David