This is the first series shaving some overhead off wq-offloading. The 1st
patch removes extra allocations, and the 3rd removes the abuse of req->refs.

There are plenty of opportunities to leak memory in a way similar to the one
mentioned in [PATCH 1/3], and I'm working on a generic fix, as I need it to
close holes in the pending splice(2) patches.

The untold idea behind [PATCH 3/3] is to reduce reference counting even
further. As the submission ref always pins the request, there is no need for
the second (i.e. completion) ref. What's more, with a bit of reshuffling, we
can get rid of req->refs altogether by using a non-atomic ref under
@compl_lock, which will usually be bundled with fill_event(). I'll play with
it soon. Any ideas or concerns regarding it?

Regarding [PATCH 3/3], is there a better way to do it for io_poll_add()?

Pavel Begunkov (3):
  io_uring: pass sqe for link head
  io_uring: deduce force_nonblock in io_issue_sqe()
  io_uring: pass submission ref to async

 fs/io_uring.c | 60 +++++++++++++++++++++++++++++---------------------
 1 file changed, 34 insertions(+), 26 deletions(-)

-- 
2.24.0