On 22/07/2020 16:52, Christoph Hellwig wrote:
> On Wed, Jul 22, 2020 at 12:43:21PM +0000, Johannes Thumshirn wrote:
>> On 21/07/2020 07:54, Christoph Hellwig wrote:
>>> On Mon, Jul 20, 2020 at 04:48:50PM +0000, Johannes Thumshirn wrote:
>>>> On 20/07/2020 15:45, Christoph Hellwig wrote:
>>>>> On Mon, Jul 20, 2020 at 10:21:18PM +0900, Johannes Thumshirn wrote:
>>>>>> On a successful completion, the position the data is written to is
>>>>>> returned via AIO's res2 field to the calling application.
>>>>>
>>>>> That is a major, and except for this changelog, undocumented ABI
>>>>> change. We had the whole discussion about reporting append results
>>>>> in a few threads and the issues with that in io_uring. So let's
>>>>> have that discussion there and don't mix it up with how zonefs
>>>>> writes data. Without that a lot of the boilerplate code should
>>>>> also go away.
>>>>>
>>>>
>>>> OK maybe I didn't remember correctly, but wasn't this all around
>>>> io_uring and how we'd report the location back for raw block device
>>>> access?
>>>
>>> Report the write offset. The author seems to be hell bent on making
>>> it block device specific, but that is a horrible idea as it is just
>>> as useful for normal file systems (or zonefs).
>>
>> After having looked into io_uring I don't think there is anything that
>> prevents io_uring from picking up the write offset from ki_complete's
>> res2 argument. As of now io_uring ignores the field but that can be
>> changed.
>
> Sure. Except for the fact that the io_uring CQE doesn't have space
> for it. See the currently ongoing discussion on that..

That one I was aware of, but I thought once that discussion has settled
the write offset can be copied from res2 into whatever people have agreed
on by then.

>
>> So the only thing that needs to be done from a zonefs perspective is
>> documenting the use of res2 and CC linux-aio and linux-abi (including
>> an update of the io_getevents man page).
>>
>> Or am I completely off track now?
>
> Yes. We should not have a different ABI just for zonefs. We need to
> support this feature in a generic way and not as a weird one off for
> one filesystem and only with the legacy AIO interface.

OK, will have a look.

> Either way please make sure you properly separate the interface (
> using Write vs Zone Append in zonefs) from the interface (returning
> the actually written offset from appending writes), as they are quite
> separate issues.

So doing async RWF_APPEND writes using Zone Append isn't the problem here,
it's "only" the reporting of the write offset back to user-space? So once
we have sorted this out we can start issuing zone appends for zonefs async
writes?

Thanks,
Johannes