From: Kanchan Joshi <joshi.k@xxxxxxxxxxx> With F_SET_RW_HINT fcntl, user can set a hint on the file inode, and all the subsequent writes on the file pass that hint value down. This can be limiting for block device as all the writes will be tagged with only one lifetime hint value. Concurrent writes (with different hint values) are hard to manage. Per-IO hinting solves that problem. Allow userspace to pass additional metadata in the SQE. __u16 write_hint; If the hint is provided, filesystems may optionally use it. A filesytem may ignore this field if it does not support per-io hints, or if the value is invalid for its backing storage. Just like the inode hints, requesting values that are not supported by the hardware are not an error. Signed-off-by: Kanchan Joshi <joshi.k@xxxxxxxxxxx> Signed-off-by: Nitesh Shetty <nj.shetty@xxxxxxxxxxx> Signed-off-by: Keith Busch <kbusch@xxxxxxxxxx> --- include/uapi/linux/io_uring.h | 4 ++++ io_uring/io_uring.c | 2 ++ io_uring/rw.c | 2 +- 3 files changed, 7 insertions(+), 1 deletion(-) diff --git a/include/uapi/linux/io_uring.h b/include/uapi/linux/io_uring.h index 56cf30b49ef5f..4a6c95c923eb4 100644 --- a/include/uapi/linux/io_uring.h +++ b/include/uapi/linux/io_uring.h @@ -98,6 +98,10 @@ struct io_uring_sqe { __u64 addr3; __u64 __pad2[1]; }; + struct { + __u64 __pad4[1]; + __u16 write_hint; + }; __u64 optval; /* * If the ring is initialized with IORING_SETUP_SQE128, then diff --git a/io_uring/io_uring.c b/io_uring/io_uring.c index 076171977d5e3..115af82b9151f 100644 --- a/io_uring/io_uring.c +++ b/io_uring/io_uring.c @@ -4169,6 +4169,8 @@ static int __init io_uring_init(void) BUILD_BUG_SQE_ELEM(46, __u16, __pad3[0]); BUILD_BUG_SQE_ELEM(48, __u64, addr3); BUILD_BUG_SQE_ELEM_SIZE(48, 0, cmd); + BUILD_BUG_SQE_ELEM(48, __u64, __pad4); + BUILD_BUG_SQE_ELEM(56, __u16, write_hint); BUILD_BUG_SQE_ELEM(56, __u64, __pad2); BUILD_BUG_ON(sizeof(struct io_uring_files_update) != diff --git a/io_uring/rw.c b/io_uring/rw.c index 93526a64ccd60..fdab23424f386 100644 --- a/io_uring/rw.c +++ b/io_uring/rw.c @@ -279,7 +279,7 @@ static int io_prep_rw(struct io_kiocb *req, const struct io_uring_sqe *sqe, rw->kiocb.ki_ioprio = get_current_ioprio(); } rw->kiocb.dio_complete = NULL; - + rw->kiocb.ki_write_hint = READ_ONCE(sqe->write_hint); rw->addr = READ_ONCE(sqe->addr); rw->len = READ_ONCE(sqe->len); rw->flags = READ_ONCE(sqe->rw_flags); -- 2.43.5