Re: [PATCH 2/2] zonefs: use zone-append for AIO as well


 



On 2020/07/22 21:43, Johannes Thumshirn wrote:
> On 21/07/2020 07:54, Christoph Hellwig wrote:
>> On Mon, Jul 20, 2020 at 04:48:50PM +0000, Johannes Thumshirn wrote:
>>> On 20/07/2020 15:45, Christoph Hellwig wrote:
>>>> On Mon, Jul 20, 2020 at 10:21:18PM +0900, Johannes Thumshirn wrote:
>>>>> On a successful completion, the position the data is written to is
>>>>> returned via AIO's res2 field to the calling application.
>>>>
>>>> That is a major, and except for this changelog, undocumented ABI
>>>> change.  We had the whole discussion about reporting append results
>>>> in a few threads and the issues with that in io_uring.  So let's
>>>> have that discussion there and don't mix it up with how zonefs
>>>> writes data.  Without that a lot of the boilerplate code should
>>>> also go away.
>>>>
>>>
>>> OK maybe I didn't remember correctly, but wasn't this all around 
>>> io_uring and how we'd report the location back for raw block device
>>> access?
>>
>> Report the write offset.  The author seems to be hell bent on making
>> it block device specific, but that is a horrible idea as it is just
>> as useful for normal file systems (or zonefs).
> 
> After having looked into io_uring I don't think there is anything that
> prevents io_uring from picking up the write offset from ki_complete's
> res2 argument. As of now io_uring ignores the field but that can be
> changed.
> 
> The reporting of the write offset to user-space still needs to be 
> decided on from an io_uring PoV.
> 
> So the only thing that needs to be done from a zonefs perspective is 
> documenting the use of res2 and CC linux-aio and linux-api (including
> an update of the io_getevents man page).
> 
> Or am I completely off track now?

That is the general idea. But Christoph's point was that reporting the effective
write offset back to user space can be done not only for zone append, but also
for regular FS/files that are opened with O_APPEND and written with AIOs,
legacy or io_uring. In this case, the aio->aio_offset field is ignored: the
kiocb pos is initialized with the file size, then incremented by the write size
for the next AIO, so the user never sees the actual write offset of its AIOs.
Reporting that back for regular files too can be useful, even though current
applications can do without this (or do not use O_APPEND because it is
lacking).
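
To make the O_APPEND case concrete, here is a minimal user-space model of the
offset handling described above. It is only a sketch, not kernel code: struct
model_file and model_append() are hypothetical names made up for illustration,
and the model ignores concurrency and short writes.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model of an O_APPEND file: only its size matters here. */
struct model_file {
	long long size;
};

/*
 * Model of one append write. The offset supplied by the caller is ignored,
 * the write lands at the current end of file, and the size grows by len.
 * The return value is the effective write offset, i.e. the value this
 * thread proposes to report back to user space.
 */
long long model_append(struct model_file *f, long long ignored_offset,
		       size_t len)
{
	long long pos;

	assert(f != NULL);
	(void)ignored_offset;		/* plays no role for O_APPEND */
	pos = f->size;			/* pos starts at the file size */
	f->size += (long long)len;	/* next append starts after this one */
	return pos;
}
```

With a file of 4096 bytes, two appends of 512 and 1024 bytes land at offsets
4096 and 4608 whatever offset the caller passed in, and today the caller has
no way to learn those two values from the completion.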

Christoph, please loudly shout at me if I misunderstood you :)

For the regular FS/file case, getting the written file offset is simple: we
only need to use kiocb->ki_pos. That is not a per-FS change.

For the user interface, yes, I agree, res2 is the way to go. And we need to
decide for io_uring how to do it. That is an API change, backward compatible for
legacy AIO, but still a change. So the linux-aio and linux-api lists should be
consulted. Ideally, for io_uring, something backward compatible would be nice
too. Not sure how to do it yet.
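
For legacy AIO, the application side could then look roughly like the sketch
below. This assumes the convention proposed in this thread (res2 carries the
write offset on success), which is not a documented ABI today. struct
event_model is a local stand-in with the same four fields as struct io_event,
and completed_write_offset() is a hypothetical helper name.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Local stand-in for the completion event; for illustration only. */
struct event_model {
	uint64_t data;	/* user cookie from the submitted request */
	uint64_t obj;	/* pointer to the submitted request */
	int64_t res;	/* bytes written, or negative errno */
	int64_t res2;	/* ASSUMPTION: write offset on success */
};

/*
 * Extract the effective write offset from a completion under the proposed
 * convention. Returns 0 and fills *offset on success, or -1 and leaves
 * *offset untouched if the write failed.
 */
int completed_write_offset(const struct event_model *ev, int64_t *offset)
{
	assert(ev != NULL && offset != NULL);

	if (ev->res < 0)
		return -1;	/* failed write: res2 is not an offset */

	*offset = ev->res2;
	return 0;
}
```

An application that ignores res2 keeps working unchanged, which is what makes
this backward compatible for legacy AIO.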

Whatever the interface, plugging zonefs into it is the trivial part as you
already did the heavier lifting with writing the async zone append path.


> 
> Thanks,
> 	Johannes
> 


-- 
Damien Le Moal
Western Digital Research



