On 24/02/2020 10:12, Andres Freund wrote: > Hi, > > On 2020-02-23 20:52:26 -0700, Jens Axboe wrote: >> The fast case is not being deferred, that's by far the common (and hot) >> case, which means io_issue() is called with sqe != NULL. My worry is >> that by moving it into a prep helper, the compiler isn't smart enough to >> not make that basically two switches. > > I'm not sure that benefit of a single switch isn't offset by the lower > code density due to the additional per-opcode branches. Not inlining > the prepare function results in: > The first looks good, I like the change. Do you have performance numbers? e.g. tools/io_uring/io_uring-bench (do_nop=1, with high DEPTH e.g. 100) would be good enough to estimate relative overhead. I don't expect any difference, TBH. > There's still some unnecessary branching on force_nonblocking. The > second patch just separates the cases needing force_nonblocking > out. Probably not quite the right structure. > It's trickier there. It can get into io_prep_issue_sqe_nonblock() -> io_req_prep() with sqe=NULL. With a glance look, it should crash. The culprit is __io_queue_sqe() with linked requests. Also, io_issue_sqe_nonblock() would look better than io_prep_issue_sqe_nonblock(). BTW, did you tried to run regression tests? It's under liburing repository. > > Not quite sure what the policy is with attaching POC patches? Also send > as separate emails? I'd prefer it inlined (i.e. as text, not attachment), so it can be inline-commented. -- Pavel Begunkov