> On Feb 13, 2020, at 11:14 PM, Jens Axboe <axboe@xxxxxxxxx> wrote:
>
> On 2/13/20 8:08 AM, Pavel Begunkov wrote:
>> On 2/13/2020 3:33 AM, Carter Li 李通洲 wrote:
>>> Thanks for your reply.
>>>
>>> You are right the nop isn't really a good test case. But I actually
>>> found this issue when benchmarking my echo server, which didn't use
>>> NOP of course.
>>
>> If there are no hidden subtle issues in io_uring, your benchmark or the
>> used pattern itself, it's probably because of overhead on async punting
>> (copying iovecs, several extra switches, refcounts, grabbing mm/fs/etc,
>> io-wq itself).
>>
>> I was going to tune async/punting stuff anyway, so I'll look into this.
>> And of course, there is always a good chance Jens has some bright insights
>
> The main issue here is that if you do the poll->recv, then it'll be
> an automatic punt of the recv to async context when the poll completes.
> That's regardless of whether or not we can complete the poll inline,
> we never attempt to recv inline from that completion. This is in contrast
> to doing a separate poll, getting the notification, then doing another
> sqe and io_uring_enter to perform the recv. For this case, we end up
> doing everything inline, just with the cost of an additional system call
> to submit the new recv.
>
> It'd be really cool if we could improve on this situation, as recv (or
> read) preceded by a poll is indeed a common use case. Or ditto for the
> write side.
>
>> BTW, what's the benefit of doing poll(fd)->read(fd), but not directly read()?
>
> If there's no data to begin with, then the read will go async. Hence
> it'll be a switch to a worker thread. The above should avoid it, but
> it doesn't.

Yes. I actually tested `directly read()` first, and found it was about
30% slower than poll(fd)->read(fd).

https://github.com/axboe/liburing/issues/69

So it turns out that async punting has high overhead.

A (silly) question: could we implement read/write operations that would
block as poll->read/write, i.e. wait for readiness and then do the I/O,
instead of punting them to a worker thread?
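
To make the question concrete, here is a minimal userspace sketch of the
behaviour I mean, written with plain poll(2)/recv(2) rather than io_uring.
The helper name `recv_or_poll` and its timeout parameter are made up for
illustration; they are not part of liburing or the kernel, and a real
change would have to live inside io_uring itself.

```c
#include <errno.h>
#include <poll.h>
#include <stddef.h>
#include <sys/socket.h>
#include <sys/types.h>

/*
 * Hypothetical helper, NOT an io_uring or liburing API.
 * Try the receive without blocking; only if it would block, wait for
 * readiness with poll() and then retry the receive.
 * Returns the byte count, 0 on EOF, or -1 with errno set
 * (errno == EAGAIN if the poll timed out).
 */
ssize_t recv_or_poll(int fd, void *buf, size_t len, int timeout_ms)
{
    /* Fast path: data is already queued, so the receive completes inline. */
    ssize_t n = recv(fd, buf, len, MSG_DONTWAIT);
    if (n >= 0 || (errno != EAGAIN && errno != EWOULDBLOCK))
        return n;

    /* Slow path: wait for readiness instead of parking a thread in recv(). */
    struct pollfd pfd = { .fd = fd, .events = POLLIN };
    int ready;
    do {
        ready = poll(&pfd, 1, timeout_ms);
    } while (ready < 0 && errno == EINTR);

    if (ready < 0)
        return -1;          /* poll() itself failed */
    if (ready == 0) {
        errno = EAGAIN;     /* timed out, still nothing to read */
        return -1;
    }

    /* Readable, hung up, or errored: let recv() report the final result. */
    return recv(fd, buf, len, MSG_DONTWAIT);
}
```

With a connected socket `fd`, `recv_or_poll(fd, buf, sizeof(buf), -1)`
returns immediately when data is pending and otherwise sleeps in poll()
until the socket becomes readable. That is roughly what I would hope a
"would block" read could do internally.
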

>
> For Carter's sake, it's worth noting that the poll command is special
> and normal requests would be more efficient with links. We just need
> to work on making the poll linked with read/write perform much better.

Thanks

>
> --
> Jens Axboe