Re: [PATCH v3 4/4] io_uring: add support for zone-append

On 7/5/20 3:09 PM, Matthew Wilcox wrote:
> On Sun, Jul 05, 2020 at 03:00:47PM -0600, Jens Axboe wrote:
>> On 7/5/20 12:47 PM, Kanchan Joshi wrote:
>>> From: Selvakumar S <selvakuma.s1@xxxxxxxxxxx>
>>>
>>> For zone-append, block-layer will return zone-relative offset via ret2
>>> of ki_complete interface. Make changes to collect it, and send to
>>> user-space using cqe->flags.
>>>
>>> Signed-off-by: Selvakumar S <selvakuma.s1@xxxxxxxxxxx>
>>> Signed-off-by: Kanchan Joshi <joshi.k@xxxxxxxxxxx>
>>> Signed-off-by: Nitesh Shetty <nj.shetty@xxxxxxxxxxx>
>>> Signed-off-by: Javier Gonzalez <javier.gonz@xxxxxxxxxxx>
>>> ---
>>>  fs/io_uring.c | 21 +++++++++++++++++++--
>>>  1 file changed, 19 insertions(+), 2 deletions(-)
>>>
>>> diff --git a/fs/io_uring.c b/fs/io_uring.c
>>> index 155f3d8..cbde4df 100644
>>> --- a/fs/io_uring.c
>>> +++ b/fs/io_uring.c
>>> @@ -402,6 +402,8 @@ struct io_rw {
>>>  	struct kiocb			kiocb;
>>>  	u64				addr;
>>>  	u64				len;
>>> +	/* zone-relative offset for append, in sectors */
>>> +	u32			append_offset;
>>>  };
>>
>> I don't like this very much at all. As it stands, the first cacheline
>> of io_kiocb is set aside for request-private data. io_rw is already
>> exactly 64 bytes, which means that you're now growing io_rw beyond
>> a cacheline and increasing the size of io_kiocb as a whole.
>>
>> Maybe you can reuse io_rw->len for this, as that is only used on the
>> submission side of things.
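
To make the size concern concrete, here is a minimal userspace sketch. It
does not use the real kernel structures: `mock_kiocb` is a hypothetical
48-byte stand-in (64 bytes minus the two u64 fields), and the `mock_io_rw*`
names and `fits_in_cacheline()` are invented for illustration. It compares
appending a new field with overlaying it on `len`.

```c
#include <stddef.h>
#include <stdint.h>

#define CACHELINE_BYTES 64

/* Hypothetical stand-in for struct kiocb: only its size matters here. */
struct mock_kiocb {
	uint64_t opaque[6];	/* 48 bytes */
};

/* Layout as described in the thread: exactly one cacheline. */
struct mock_io_rw {
	struct mock_kiocb kiocb;
	uint64_t addr;
	uint64_t len;
};

/* Layout from the patch: a new field appended after len. */
struct mock_io_rw_grown {
	struct mock_kiocb kiocb;
	uint64_t addr;
	uint64_t len;
	uint32_t append_offset;
};

/* Suggested alternative: overlay the offset on len, which is only
 * needed on the submission side. */
struct mock_io_rw_reuse {
	struct mock_kiocb kiocb;
	uint64_t addr;
	union {
		uint64_t len;		/* submission side */
		uint32_t append_offset;	/* completion side */
	};
};

_Static_assert(sizeof(struct mock_io_rw) == CACHELINE_BYTES,
	       "baseline mock should fill exactly one cacheline");

/* Returns 1 if an object of the given size fits in one cacheline. */
static int fits_in_cacheline(size_t size)
{
	return size <= CACHELINE_BYTES;
}
```

Under these assumptions, `fits_in_cacheline(sizeof(struct mock_io_rw_grown))`
is false while the union variant still fits, which is the trade-off the
suggestion above is pointing at.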
> 
> I'm surprised you aren't more upset by the abuse of cqe->flags for the
> address.

Yeah, it's not great either, but we have less leeway there in terms of
how much space is available to pass back extra data.
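
As a rough illustration of how little room there is: a CQE carries a 64-bit
user_data, a 32-bit res, and a 32-bit flags. The sketch below uses a mock
struct rather than the uapi header, and assumes the semantics proposed in
this patch (flags holding a zone-relative offset in 512-byte sectors).
`mock_cqe` and `append_location_bytes()` are hypothetical names, not an
existing API.

```c
#include <stdint.h>

/* Mock of the completion entry layout: 16 bytes in total. */
struct mock_cqe {
	uint64_t user_data;	/* passed through from the sqe */
	int32_t  res;		/* result code for this request */
	uint32_t flags;		/* assumed here: zone-relative offset, in sectors */
};

_Static_assert(sizeof(struct mock_cqe) == 16,
	       "mock cqe should be 16 bytes");

#define SECTOR_SHIFT 9	/* 512-byte sectors */

/*
 * Turn the zone-relative sector offset into an absolute byte position,
 * given the zone start in bytes (which the application must track itself).
 */
static uint64_t append_location_bytes(uint64_t zone_start_bytes,
				      const struct mock_cqe *cqe)
{
	return zone_start_bytes + ((uint64_t)cqe->flags << SECTOR_SHIFT);
}
```

With only 32 bits available, a full 64-bit byte offset cannot be returned
this way, which is why the patch reports a zone-relative value in sectors.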

> What do you think to my idea of interpreting the user_data as being a
> pointer to somewhere to store the address?  Obviously other things
> can be stored after the address in the user_data.

I don't like that at all, as all other commands just pass user_data
through. This means the application would have to treat this very
differently, and potentially not have a way to store any data for
locating the original command on the user side.

> Or we could have a separate flag to indicate that is how to interpret
> the user_data.

I'd be vehemently against changing user_data in any shape or form.
It's to be passed through from sqe to cqe, that's how the command flow
works. It's never kernel generated, and it's also used as a key for
command lookup.
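
For illustration, a minimal sketch of the pass-through model from the
application's point of view: the value stored in sqe->user_data comes back
unchanged in cqe->user_data and is used to find the original request. The
table and the `track_request()` / `complete_request()` helpers are
hypothetical, and no real ring is involved.

```c
#include <stddef.h>
#include <stdint.h>

#define MAX_INFLIGHT 8
#define NO_SLOT UINT64_MAX

/* Application-side record of a submitted command. */
struct request {
	int      in_use;
	uint64_t file_offset;
	uint32_t len;
};

static struct request inflight[MAX_INFLIGHT];

/*
 * Record a request and return the key the application would place in
 * sqe->user_data. Returns NO_SLOT if the table is full.
 */
static uint64_t track_request(uint64_t file_offset, uint32_t len)
{
	for (uint64_t i = 0; i < MAX_INFLIGHT; i++) {
		if (!inflight[i].in_use) {
			inflight[i].in_use = 1;
			inflight[i].file_offset = file_offset;
			inflight[i].len = len;
			return i;
		}
	}
	return NO_SLOT;
}

/*
 * Look up the request for a completion using the value read back from
 * cqe->user_data, and release its slot. Returns NULL for unknown keys.
 */
static struct request *complete_request(uint64_t user_data)
{
	if (user_data >= MAX_INFLIGHT || !inflight[user_data].in_use)
		return NULL;
	inflight[user_data].in_use = 0;
	return &inflight[user_data];
}
```

If user_data were reinterpreted as a pointer to where the kernel stores the
address, the application would need another way to carry this lookup key.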

-- 
Jens Axboe



