Re: [PATCH 2/2] io_uring: batch getting pcpu references

Jens Axboe <axboe@xxxxxxxxx> · Tue, 17 Dec 2019 17:02:57 -0700

On 12/17/19 3:28 PM, Pavel Begunkov wrote:
> percpu_ref_tryget() has its own overhead. Instead of getting a reference
> for each request, grab a bunch once per io_submit_sqes().
> 
> A basic benchmark that submits and waits for 128 non-linked nops showed
> a ~5% performance gain (7044 vs 7423 KIOPS).

Confirmed about 5% here as well, doing polled IO to a fast device.
That's a huge gain!
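
For anyone following along, the batching idea from the quoted description
can be sketched in userspace C. This is only an illustration under stated
assumptions, not the kernel code: struct ref, ref_init(), ref_tryget_many(),
ref_put_many(), submit_per_request() and submit_batched() are hypothetical
names, a plain C11 atomic stands in for the percpu ref, and the atomic_ops
field exists only to count how many counter updates each approach performs.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for a context's reference counter. */
struct ref {
    atomic_long count;   /* outstanding references */
    atomic_bool dying;   /* once set, new references are refused */
    long atomic_ops;     /* illustration only: number of counter updates */
};

static void ref_init(struct ref *r)
{
    atomic_init(&r->count, 0);
    atomic_init(&r->dying, false);
    r->atomic_ops = 0;
}

/* Take nr references in one update; fails if the ref is being torn down. */
static bool ref_tryget_many(struct ref *r, long nr)
{
    if (atomic_load(&r->dying))
        return false;
    atomic_fetch_add(&r->count, nr);
    r->atomic_ops++;
    return true;
}

/* Drop nr references in one update. */
static void ref_put_many(struct ref *r, long nr)
{
    atomic_fetch_sub(&r->count, nr);
    r->atomic_ops++;
}

/* Unbatched approach: one tryget per request. */
static long submit_per_request(struct ref *r, long nr)
{
    long submitted = 0;

    for (long i = 0; i < nr; i++) {
        if (!ref_tryget_many(r, 1))
            break;
        submitted++;
    }
    return submitted;
}

/*
 * Batched approach: take nr references up front, then hand back the ones
 * that were not consumed. fail_at simulates a submission error at that
 * index; pass -1 for no error.
 */
static long submit_batched(struct ref *r, long nr, long fail_at)
{
    long submitted = 0;

    if (!ref_tryget_many(r, nr))
        return 0;

    for (long i = 0; i < nr; i++) {
        if (i == fail_at)
            break;
        submitted++;   /* each submitted request owns one reference */
    }

    /* Return any references the batch did not use. */
    if (submitted < nr)
        ref_put_many(r, nr - submitted);
    return submitted;
}
```

With 128 requests, the per-request path updates the counter 128 times,
while the batched path updates it once, or twice if the batch stops early
and the unused references have to be returned.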

-- 
Jens Axboe