On 7/5/20 3:09 PM, Matthew Wilcox wrote:
> On Sun, Jul 05, 2020 at 03:00:47PM -0600, Jens Axboe wrote:
>> On 7/5/20 12:47 PM, Kanchan Joshi wrote:
>>> From: Selvakumar S <selvakuma.s1@xxxxxxxxxxx>
>>>
>>> For zone-append, block-layer will return zone-relative offset via ret2
>>> of ki_complete interface. Make changes to collect it, and send to
>>> user-space using cqe->flags.
>>>
>>> Signed-off-by: Selvakumar S <selvakuma.s1@xxxxxxxxxxx>
>>> Signed-off-by: Kanchan Joshi <joshi.k@xxxxxxxxxxx>
>>> Signed-off-by: Nitesh Shetty <nj.shetty@xxxxxxxxxxx>
>>> Signed-off-by: Javier Gonzalez <javier.gonz@xxxxxxxxxxx>
>>> ---
>>>  fs/io_uring.c | 21 +++++++++++++++++++--
>>>  1 file changed, 19 insertions(+), 2 deletions(-)
>>>
>>> diff --git a/fs/io_uring.c b/fs/io_uring.c
>>> index 155f3d8..cbde4df 100644
>>> --- a/fs/io_uring.c
>>> +++ b/fs/io_uring.c
>>> @@ -402,6 +402,8 @@ struct io_rw {
>>>  	struct kiocb kiocb;
>>>  	u64 addr;
>>>  	u64 len;
>>> +	/* zone-relative offset for append, in sectors */
>>> +	u32 append_offset;
>>>  };
>>
>> I don't like this very much at all. As it stands, the first cacheline
>> of io_kiocb is set aside for request-private data. io_rw is already
>> exactly 64 bytes, which means that you're now growing io_rw beyond
>> a cacheline and increasing the size of io_kiocb as a whole.
>>
>> Maybe you can reuse io_rw->len for this, as that is only used on the
>> submission side of things.
>
> I'm surprised you aren't more upset by the abuse of cqe->flags for the
> address.

Yeah, it's not great either, but we have less leeway there in terms of
how much space is available to pass back extra data.

> What do you think to my idea of interpreting the user_data as being a
> pointer to somewhere to store the address? Obviously other things
> can be stored after the address in the user_data.

I don't like that at all, as all other commands just pass user_data
through. This means the application would have to treat this very
differently, and potentially not have a way to store any data for
locating the original command on the user side.

> Or we could have a separate flag to indicate that is how to interpret
> the user_data.

I'd be vehemently against changing user_data in any shape or form.
It's to be passed through from sqe to cqe, that's how the command flow
works. It's never kernel generated, and it's also used as a key for
command lookup.

-- 
Jens Axboe
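
For reference, the cacheline concern about io_rw can be checked with a small
userspace sketch. This is only an illustration under stated assumptions:
`struct kiocb_standin` is a hypothetical 48-byte placeholder, sized so that the
unpatched io_rw comes out at the 64 bytes quoted above, and it is not the real
kernel layout. `CACHELINE_BYTES` and the three struct names are likewise made
up for the example. The union variant is one possible shape of the "reuse
io_rw->len" suggestion, not the code that was merged.

```c
#include <assert.h>   /* static_assert (C11) */
#include <stddef.h>
#include <stdint.h>

#define CACHELINE_BYTES 64	/* assumed cacheline size */

/* Hypothetical stand-in for struct kiocb: an opaque 48-byte blob. */
struct kiocb_standin {
	uint64_t opaque[6];
};

/* io_rw before the patch: 48 + 8 + 8 = 64 bytes. */
struct io_rw_base {
	struct kiocb_standin kiocb;
	uint64_t addr;
	uint64_t len;
};

/* io_rw with the extra field from the patch appended. */
struct io_rw_patched {
	struct kiocb_standin kiocb;
	uint64_t addr;
	uint64_t len;
	uint32_t append_offset;	/* pushes the struct past 64 bytes */
};

/*
 * Sketch of the "reuse len" idea: len is only needed on the submission
 * side, so the completion-side offset shares its storage.
 */
struct io_rw_reuse {
	struct kiocb_standin kiocb;
	uint64_t addr;
	union {
		uint64_t len;		/* valid at submission */
		uint32_t append_offset;	/* valid at completion */
	};
};

static_assert(sizeof(struct io_rw_base) == CACHELINE_BYTES,
	      "base io_rw fills exactly one cacheline");
static_assert(sizeof(struct io_rw_patched) > CACHELINE_BYTES,
	      "appending a field spills into a second cacheline");
static_assert(sizeof(struct io_rw_reuse) == CACHELINE_BYTES,
	      "sharing storage with len keeps the size unchanged");
```

If the file compiles, all three size claims hold for the stand-in layout. The
trade-off of the union is that len must not be read once the completion path
has written append_offset.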