On 2/8/19 6:34 PM, Jens Axboe wrote:
This enables an application to do IO, without ever entering the kernel. By using the SQ ring to fill in new sqes and watching for completions on the CQ ring, we can submit and reap IOs without doing a single system call. The kernel side thread will poll for new submissions, and in case of HIPRI/polled IO, it'll also poll for completions. By default, we allow 1 second of active spinning. This can by changed by passing in a different grace period at io_uring_register(2) time. If the thread exceeds this idle time without having any work to do, it will set: sq_ring->flags |= IORING_SQ_NEED_WAKEUP. The application will have to call io_uring_enter() to start things back up again. If IO is kept busy, that will never be needed. Basically an application that has this feature enabled will guard it's io_uring_enter(2) call with: read_barrier(); if (*sq_ring->flags & IORING_SQ_NEED_WAKEUP) io_uring_enter(fd, 0, 0, IORING_ENTER_SQ_WAKEUP); instead of calling it unconditionally. It's mandatory to use fixed files with this feature. Failure to do so will result in the application getting an -EBADF CQ entry when submitting IO. Signed-off-by: Jens Axboe <axboe@xxxxxxxxx> --- fs/io_uring.c | 249 +++++++++++++++++++++++++++++++++- include/uapi/linux/io_uring.h | 12 +- 2 files changed, 253 insertions(+), 8 deletions(-)
Reviewed-by: Hannes Reinecke <hare@xxxxxxxx> Cheers, Hannes