On 2/12/20 9:31 AM, Carter Li 李通洲 wrote:
> Hi everyone,
>
> IOSQE_IO_LINK seems to have very high cost, even greater then io_uring_enter syscall.
>
> Test code attached below. The program completes after getting 100000000 cqes.
>
> $ gcc test.c -luring -o test0 -g -O3 -DUSE_LINK=0
> $ time ./test0
> USE_LINK: 0, count: 100000000, submit_count: 1562500
> 0.99user 9.99system 0:11.02elapsed 99%CPU (0avgtext+0avgdata 1608maxresident)k
> 0inputs+0outputs (0major+72minor)pagefaults 0swaps
>
> $ gcc test.c -luring -o test1 -g -O3 -DUSE_LINK=1
> $ time ./test1
> USE_LINK: 1, count: 100000110, submit_count: 799584
> 0.83user 19.21system 0:20.90elapsed 95%CPU (0avgtext+0avgdata 1632maxresident)k
> 0inputs+0outputs (0major+72minor)pagefaults 0swaps
>
> As you can see, the `-DUSE_LINK=1` version emits only about half io_uring_submit calls
> of the other version, but takes twice as long. That makes IOSQE_IO_LINK almost useless,
> please have a check.

The nop isn't really a good test case, as it doesn't contain any smarts
in terms of executing a link fast. So it doesn't say a whole lot outside
of "we could make nop links faster", which is also kind of pointless.

"Normal" commands will work better. Where the link is really a win is if
the first request needs to go async to complete. For that case, the next
link can execute directly from that context. This saves an async punt
for the common case.

-- 
Jens Axboe