On 6/14/21 4:37 PM, Pavel Begunkov wrote:
> There are two main lines intervened. The first one is pt.2 of ctx field
> shuffling for better caching. There are a couple of things left on that
> front.
>
> The second is optimising (assumably) rarely used offset-based timeouts
> and draining. There is a downside (see 12/12), which will be fixed
> later. The plan is to queue a task_work clearing drain_used (under
> uring_lock) from io_queue_deferred() once all drainees are gone.
>
> nops(batch=32):
>     15.9 MIOPS vs 17.3 MIOPS
> nullblk (irqmode=2 completion_nsec=0 submit_queues=16), no merges, no stat
>     1002 KIOPS vs 1050 KIOPS
>
> Though the second test is very slow compared to what I've seen before,
> so it might not be representative.

Applied, thanks. I'll run this through my testing, too.

-- 
Jens Axboe