On Mon, Dec 14, 2020 at 07:01:31PM +0000, Pavel Begunkov wrote:
> On 14/12/2020 18:23, Keith Busch wrote:
> > The existing block layer polling semantics doesn't poll for a specific
> > request. Please see the blk_mq_ops driver API for the 'poll' function.
> > It takes a hardware context, which does not indicate a specific request.
> > See also the blk_poll() function, which doesn't consider any specific
> > request in order to break out of the polling loop.
>
> Yeah, thanks for pointing out, it's just the users do it that way --
> block layer dio and somewhat true for io_uring, and also hybrid part is
> per request based (and sleeps once per request), that stands out.
> If would go with coml-to-compl it should be changed. And not to forget
> that subm-to-compl sometimes is more desirable.

Right, so coming full circle to my initial reply: the block polling
thread may be responsible for multiple requests when it wakes up, yet
the hybrid sleep timer considers only one; therefore, the sleep
criterion is not always accurate and is worse than interrupt driven at
high q depth.

The current sleep calculation works fine for QD1, but I don't see a
clear way to calculate an accurate sleep time for higher q-depths within
a reasonable CPU cost. My only suggestion is just don't sleep at all as
long as the polling thread continues to reap completions on its first
poll.
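
To make that last suggestion concrete, here is a rough userspace sketch of
the decision logic only. It is not a patch and not block layer code:
struct poll_state, should_sleep(), record_first_poll() and count_sleeps()
are made-up names for illustration, and the "reaped" counts stand in for
whatever the first poll pass after entering the polling loop would return.
The assumption is simply: sleep before polling only if the previous first
poll found nothing.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-poller state; not a kernel structure. */
struct poll_state {
	/* true if the previous first poll reaped at least one completion */
	bool skip_sleep;
};

/* Sleep before polling only if the last first poll came back empty. */
static bool should_sleep(const struct poll_state *ps)
{
	return !ps->skip_sleep;
}

/* Record how many completions the first poll of this round reaped. */
static void record_first_poll(struct poll_state *ps, int reaped)
{
	assert(reaped >= 0);
	ps->skip_sleep = reaped > 0;
}

/*
 * Replay a sequence of first-poll results and count how many rounds
 * would have started with a hybrid sleep under this policy.
 */
static unsigned int count_sleeps(const int *first_poll_reaped, size_t n)
{
	struct poll_state ps = { .skip_sleep = false };
	unsigned int sleeps = 0;
	size_t i;

	for (i = 0; i < n; i++) {
		if (should_sleep(&ps))
			sleeps++;
		record_first_poll(&ps, first_poll_reaped[i]);
	}
	return sleeps;
}
```

With this policy a busy queue that keeps handing back completions sleeps
once up front and then never again, while a queue whose first poll keeps
coming back empty sleeps every round, which is the QD1-like case where the
existing sleep calculation already works.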