Hi Linus,

2nd attempt at adding the io_uring interface. Since the first one,
we've added basic unit testing of the three system calls. The tests
reside in liburing, like the other unit tests that we have so far.
It'll take a while to get full coverage, but we're working towards it.
I've also added two basic test programs to tools/io_uring. One uses the
raw interface and has support for all the various features that
io_uring supports outside of standard IO, like fixed files, fixed IO
buffers, and polled IO. The other uses the liburing API, and is a
simplified version of cp(1).

This pull request adds support for a new IO interface, io_uring.

io_uring allows an application to communicate with the kernel through
two rings, the submission queue (SQ) and completion queue (CQ) ring.
This allows for very efficient handling of IOs, see the v5 posting for
some basic numbers:

https://lore.kernel.org/linux-block/20190116175003.17880-1-axboe@xxxxxxxxx/

Outside of just efficiency, the interface is also flexible and
extendable, and allows for future use cases like the upcoming NVMe
key-value store API, networked IO, and so on. It also supports async
buffered IO, something that we've always failed to support in the
kernel.

Outside of basic IO features, it supports async polled IO as well. This
particular feature has already been tested at Facebook months ago for
flash storage boxes, with 25-33% improvements. It makes polled IO
actually useful for real world use cases, where even basic flash sees a
nice win in terms of efficiency, latency, and performance. These boxes
were IOPS bound before, now they are not.

This series adds three new system calls. One for setting up an io_uring
instance (io_uring_setup(2)), one for submitting/completing IO
(io_uring_enter(2)), and one for aux functions like registering file
sets, buffers, etc (io_uring_register(2)). Through the help of Arnd,
I've coordinated the syscall numbers so the merge on that front should
be painless.
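For readers new to the two-ring model described above, here is a minimal
userspace sketch of the idea. It is an illustration only, not the kernel's
layout or ABI: struct ring, ring_push(), ring_pop() and ring_round_trip()
are hypothetical names, both rings live in one process, and the "kernel"
side is just a loop that moves entries from the SQ to the CQ. It does show
the head/tail indexing with a power-of-two mask that makes this kind of
ring cheap to operate.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define RING_ENTRIES 8u                 /* must be a power of two */
#define RING_MASK    (RING_ENTRIES - 1u)

/* Hypothetical single-producer/single-consumer ring of 64-bit cookies. */
struct ring {
	uint32_t head;                  /* advanced by the consumer only */
	uint32_t tail;                  /* advanced by the producer only */
	uint64_t slots[RING_ENTRIES];
};

/* Producer side: returns false if the ring is full. */
static bool ring_push(struct ring *r, uint64_t v)
{
	if (r->tail - r->head == RING_ENTRIES)
		return false;
	r->slots[r->tail & RING_MASK] = v;
	r->tail++;      /* a ring shared across contexts needs a store-release here */
	return true;
}

/* Consumer side: returns false if the ring is empty. */
static bool ring_pop(struct ring *r, uint64_t *v)
{
	if (r->head == r->tail)
		return false;
	*v = r->slots[r->head & RING_MASK];
	r->head++;
	return true;
}

/*
 * Model one round trip: the application queues requests on the SQ, a
 * stand-in for the kernel moves each one to the CQ, and the application
 * reaps the completions. Returns the number of completions reaped.
 */
static unsigned ring_round_trip(unsigned nr_requests)
{
	struct ring sq = {0}, cq = {0};
	unsigned submitted = 0, reaped = 0;
	uint64_t cookie;

	for (unsigned i = 0; i < nr_requests; i++)
		if (ring_push(&sq, 100 + i))
			submitted++;

	while (ring_pop(&sq, &cookie)) {
		bool ok = ring_push(&cq, cookie);
		assert(ok);
		(void)ok;
	}

	while (ring_pop(&cq, &cookie)) {
		assert(cookie == 100 + reaped);  /* completions arrive in order here */
		reaped++;
	}

	assert(reaped == submitted);
	return reaped;
}
```

Calling ring_round_trip(3) reaps three completions. Asking for more than
RING_ENTRIES requests in one go only queues the first eight, since the
sketch never drains the SQ while filling it. The real interface shares the
rings between the application and the kernel, so it also needs memory
barriers around the index updates, which this single-threaded sketch
leaves out.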

Jon did a writeup of the interface a while back, which (except for
minor details that have been tweaked) is still accurate. Find that
here:

https://lwn.net/Articles/776703/

Huge thanks to Al Viro for helping get the reference cycle code
correct, and to Jann Horn for his extensive reviews focused on both
security and bugs in general.

There's a userspace library that provides basic functionality for
applications that don't need or want to care about how to fiddle with
the rings directly. It has helpers to allow applications to easily set
up an io_uring instance, and submit/complete IO through it without
knowing about the intricacies of the rings. It also includes man pages
(thanks to Jeff Moyer), and will continue to grow helper functions and
features as time progresses. Find it here:

git://git.kernel.dk/liburing

Fio has full support for the raw interface, both in the form of an IO
engine (io_uring) and as a small test application (t/io_uring) that can
exercise and benchmark the interface.

Note that this branch sits on top of my for-5.1/block branch, since the
multi-page bvec changes caused a few conflicts with the pre-mapped
buffer support. I also moved a few prep patches to that branch today,
which is why it appears recently rebased (moved the 4 bottom patches
from io_uring to for-5.1/block).

Please consider this feature for 5.1, so we can finally have something
that's fast, efficient, and feature rich for IO instead of the sad
niche case that is aio/libaio.

  git://git.kernel.dk/linux-block.git tags/io_uring-2019-03-06

----------------------------------------------------------------
Christoph Hellwig (1):
      io_uring: add fsync support

Jens Axboe (14):
      Add io_uring IO interface
      io_uring: support for IO polling
      fs: add fget_many() and fput_many()
      io_uring: use fget/fput_many() for file references
      io_uring: batch io_kiocb allocation
      block: implement bio helper to add iter bvec pages to bio
      io_uring: add support for pre-mapped user IO buffers
      net: split out functions related to registering inflight socket files
      io_uring: add file set registration
      io_uring: add submission polling
      io_uring: add io_kiocb ref count
      io_uring: add support for IORING_OP_POLL
      io_uring: allow workqueue item to handle multiple buffered requests
      io_uring: add a few test tools

 arch/x86/entry/syscalls/syscall_32.tbl |    3 +
 arch/x86/entry/syscalls/syscall_64.tbl |    3 +
 block/bio.c                            |   62 +-
 fs/Makefile                            |    1 +
 fs/aio.c                               |    6 +-
 fs/file.c                              |   15 +-
 fs/file_table.c                        |    9 +-
 fs/io_uring.c                          | 2969 ++++++++++++++++++++++++++++++++
 include/linux/file.h                   |    2 +
 include/linux/fs.h                     |   13 +-
 include/linux/sched/user.h             |    2 +-
 include/linux/syscalls.h               |    8 +
 include/net/af_unix.h                  |    1 +
 include/uapi/asm-generic/unistd.h      |    8 +-
 include/uapi/linux/io_uring.h          |  137 ++
 init/Kconfig                           |    9 +
 kernel/sys_ni.c                        |    3 +
 net/Makefile                           |    2 +-
 net/unix/Kconfig                       |    5 +
 net/unix/Makefile                      |    2 +
 net/unix/af_unix.c                     |   63 +-
 net/unix/garbage.c                     |   68 +-
 net/unix/scm.c                         |  151 ++
 net/unix/scm.h                         |   10 +
 tools/io_uring/Makefile                |   18 +
 tools/io_uring/README                  |   29 +
 tools/io_uring/barrier.h               |   16 +
 tools/io_uring/io_uring-bench.c        |  616 +++++++
 tools/io_uring/io_uring-cp.c           |  251 +++
 tools/io_uring/liburing.h              |  143 ++
 tools/io_uring/queue.c                 |  164 ++
 tools/io_uring/setup.c                 |  103 ++
 tools/io_uring/syscall.c               |   40 +
 33 files changed, 4784 insertions(+), 148 deletions(-)
 create mode 100644 fs/io_uring.c
 create mode 100644 include/uapi/linux/io_uring.h
 create mode 100644 net/unix/scm.c
 create mode 100644 net/unix/scm.h
 create mode 100644 tools/io_uring/Makefile
 create mode 100644 tools/io_uring/README
 create mode 100644 tools/io_uring/barrier.h
 create mode 100644 tools/io_uring/io_uring-bench.c
 create mode 100644 tools/io_uring/io_uring-cp.c
 create mode 100644 tools/io_uring/liburing.h
 create mode 100644 tools/io_uring/queue.c
 create mode 100644 tools/io_uring/setup.c
 create mode 100644 tools/io_uring/syscall.c

-- 
Jens Axboe