On 3/6/19 9:13 AM, Jens Axboe wrote:
> Hi Linus,
>
> 2nd attempt at adding the io_uring interface. Since the first one,
> we've added basic unit testing of the three system calls, that
> resides in liburing like the other unit tests that we have so far.
> It'll take a while to get full coverage of it, but we're working
> towards it. I've also added two basic test programs to tools/io_uring.
> One uses the raw interface and has support for all the various
> features that io_uring supports outside of standard IO, like fixed
> files, fixed IO buffers, and polled IO. The other uses the liburing
> API, and is a simplified version of cp(1).
>
> This pull request adds support for a new IO interface, io_uring.
> io_uring allows an application to communicate with the kernel through
> two rings, the submission queue (SQ) and completion queue (CQ) ring.
> This allows for very efficient handling of IOs, see the v5 posting for
> some basic numbers:
>
> https://lore.kernel.org/linux-block/20190116175003.17880-1-axboe@xxxxxxxxx/
>
> Outside of just efficiency, the interface is also flexible and
> extendable, and allows for future use cases like the upcoming NVMe
> key-value store API, networked IO, and so on. It also supports async
> buffered IO, something that we've always failed to support in the
> kernel.
>
> Outside of basic IO features, it supports async polled IO as well. This
> particular feature has already been tested at Facebook months ago for
> flash storage boxes, with 25-33% improvements. It makes polled IO
> actually useful for real world use cases, where even basic flash sees a
> nice win in terms of efficiency, latency, and performance. These boxes
> were IOPS bound before, now they are not.
>
> This series adds three new system calls. One for setting up an io_uring
> instance (io_uring_setup(2)), one for submitting/completing IO
> (io_uring_enter(2)), and one for aux functions like registering file
> sets, buffers, etc (io_uring_register(2)). Through the help of Arnd,
> I've coordinated the syscall numbers so merge on that front should be
> painless.
>
> Jon did a writeup of the interface a while back, which (except for minor
> details that have been tweaked) is still accurate. Find that here:
>
> https://lwn.net/Articles/776703/
>
> Huge thanks to Al Viro for helping get the reference cycle code
> correct, and to Jann Horn for his extensive reviews focused on both
> security and bugs in general.
>
> There's a userspace library that provides basic functionality for
> applications that don't need or want to care about how to fiddle with
> the rings directly. It has helpers to allow applications to easily set
> up an io_uring instance, and submit/complete IO through it without
> knowing about the intricacies of the rings. It also includes man pages
> (thanks to Jeff Moyer), and will continue to grow support helper
> functions and features as time progresses. Find it here:
>
> git://git.kernel.dk/liburing
>
> Fio has full support for the raw interface, both in the form of an IO
> engine (io_uring), but also with a small test application (t/io_uring)
> that can exercise and benchmark the interface.
>
> Note that this branch sits on top of my for-5.1/block branch, since the
> multi-page bvec changes caused a few conflicts with the pre-mapped
> buffer support. I also moved a few prep patches to that branch today,
> which is why it appears recently rebased (moved the 4 bottom patches
> from io_uring to for-5.1/block).
>
> Please consider this feature for 5.1, so we can finally have something
> that's fast, efficient, and feature rich for IO instead of the sad
> niche case that is aio/libaio.
>
>
>   git://git.kernel.dk/linux-block.git tags/io_uring-2019-03-06

Slight mess-up in the stats, here's the correct one...

Note that this also throws a few more merge conflicts now, due to the
syscall merges. All trivial, though, and the branch was prepared for it
in terms of numbering.

----------------------------------------------------------------
Christoph Hellwig (1):
      io_uring: add fsync support

Jens Axboe (14):
      Add io_uring IO interface
      io_uring: support for IO polling
      fs: add fget_many() and fput_many()
      io_uring: use fget/fput_many() for file references
      io_uring: batch io_kiocb allocation
      block: implement bio helper to add iter bvec pages to bio
      io_uring: add support for pre-mapped user IO buffers
      net: split out functions related to registering inflight socket files
      io_uring: add file set registration
      io_uring: add submission polling
      io_uring: add io_kiocb ref count
      io_uring: add support for IORING_OP_POLL
      io_uring: allow workqueue item to handle multiple buffered requests
      io_uring: add a few test tools

 arch/x86/entry/syscalls/syscall_32.tbl |    3 +
 arch/x86/entry/syscalls/syscall_64.tbl |    3 +
 block/bio.c                            |   62 +-
 fs/Makefile                            |    1 +
 fs/file.c                              |   15 +-
 fs/file_table.c                        |    9 +-
 fs/io_uring.c                          | 2971 ++++++++++++++++++++++++++++++++
 include/linux/file.h                   |    2 +
 include/linux/fs.h                     |   13 +-
 include/linux/sched/user.h             |    2 +-
 include/linux/syscalls.h               |    8 +
 include/net/af_unix.h                  |    1 +
 include/uapi/asm-generic/unistd.h      |    8 +-
 include/uapi/linux/io_uring.h          |  137 ++
 init/Kconfig                           |    9 +
 kernel/sys_ni.c                        |    3 +
 net/Makefile                           |    2 +-
 net/unix/Kconfig                       |    5 +
 net/unix/Makefile                      |    2 +
 net/unix/af_unix.c                     |   63 +-
 net/unix/garbage.c                     |   68 +-
 net/unix/scm.c                         |  151 ++
 net/unix/scm.h                         |   10 +
 tools/io_uring/Makefile                |   18 +
 tools/io_uring/README                  |   29 +
 tools/io_uring/barrier.h               |   16 +
 tools/io_uring/io_uring-bench.c        |  616 +++++++
 tools/io_uring/io_uring-cp.c           |  251 +++
 tools/io_uring/liburing.h              |  143 ++
 tools/io_uring/queue.c                 |  164 ++
 tools/io_uring/setup.c                 |  103 ++
 tools/io_uring/syscall.c               |   40 +
 32 files changed, 4782 insertions(+), 146 deletions(-)
 create mode 100644 fs/io_uring.c
 create mode 100644 include/uapi/linux/io_uring.h
 create mode 100644 net/unix/scm.c
 create mode 100644 net/unix/scm.h
 create mode 100644 tools/io_uring/Makefile
 create mode 100644 tools/io_uring/README
 create mode 100644 tools/io_uring/barrier.h
 create mode 100644 tools/io_uring/io_uring-bench.c
 create mode 100644 tools/io_uring/io_uring-cp.c
 create mode 100644 tools/io_uring/liburing.h
 create mode 100644 tools/io_uring/queue.c
 create mode 100644 tools/io_uring/setup.c
 create mode 100644 tools/io_uring/syscall.c

-- 
Jens Axboe
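
For readers new to the two-ring model (an application fills a submission
queue, the kernel drains it and posts results to a completion queue), here
is a rough userspace-only sketch of the index arithmetic such rings
typically rely on. It is a simplified model under stated assumptions, not
the kernel's actual data layout: struct ring, ring_push(), ring_pop(),
fake_kernel_process() and ring_demo() are hypothetical names made up for
illustration, the "kernel" is an ordinary function, and the memory barriers
a real shared ring needs between producer and consumer are left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define RING_ENTRIES 8u                 /* assumed size; must be a power of two */
#define RING_MASK    (RING_ENTRIES - 1u)

/* Hypothetical model of one ring. NOT the kernel's struct layout. */
struct ring {
	uint32_t head;                  /* next slot the consumer reads */
	uint32_t tail;                  /* next slot the producer writes */
	uint64_t slots[RING_ENTRIES];   /* payload: an opaque request tag */
};

/* Producer side: returns false when the ring is full. */
static bool ring_push(struct ring *r, uint64_t v)
{
	/* Unsigned subtraction stays correct even after the indices wrap. */
	if (r->tail - r->head == RING_ENTRIES)
		return false;
	r->slots[r->tail & RING_MASK] = v;
	r->tail++;                      /* a real ring needs a write barrier before this */
	return true;
}

/* Consumer side: returns false when the ring is empty. */
static bool ring_pop(struct ring *r, uint64_t *v)
{
	if (r->head == r->tail)
		return false;
	*v = r->slots[r->head & RING_MASK];
	r->head++;
	return true;
}

/*
 * Stand-in for the kernel: consume submissions and post one completion
 * per request, stopping early if the completion ring has no room.
 */
static unsigned fake_kernel_process(struct ring *sq, struct ring *cq)
{
	unsigned done = 0;
	uint64_t req;

	while (cq->tail - cq->head < RING_ENTRIES && ring_pop(sq, &req)) {
		ring_push(cq, req);
		done++;
	}
	return done;
}

/* Application side: submit three requests, then reap their completions. */
static unsigned ring_demo(void)
{
	struct ring sq = {0}, cq = {0};
	unsigned reaped = 0;
	uint64_t tag;

	for (uint64_t i = 1; i <= 3; i++)
		ring_push(&sq, i);
	fake_kernel_process(&sq, &cq);
	while (ring_pop(&cq, &tag))
		reaped++;
	return reaped;
}
```

Calling ring_demo() from a test harness or a small main() walks through one
submit/process/reap cycle. Keeping head and tail as free-running counters
and masking only on slot access is one common way to tell a full ring from
an empty one without wasting a slot; whether the real interface does
exactly this is best checked against include/uapi/linux/io_uring.h and the
tools/io_uring sources listed in the diffstat.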