On Wed, Jun 17, 2020 at 11:56:34PM -0700, Christoph Hellwig wrote:
On Wed, Jun 17, 2020 at 10:53:36PM +0530, Kanchan Joshi wrote:
This patchset enables issuing zone-append using aio and io-uring direct-io interface.
For aio, this introduces opcode IOCB_CMD_ZONE_APPEND. Application uses start LBA
of the zone to issue append. On completion 'res2' field is used to return
zone-relative offset.
For io-uring, this introduces three opcodes: IORING_OP_ZONE_APPEND/APPENDV/APPENDV_FIXED.
Since io_uring does not have aio-like res2, cqe->flags are repurposed to return zone-relative offset
And what exactly are the semantics supposed to be? Remember the
unix file abstractions does not know about zones at all.
I really don't think squeezing low-level not quite block storage
protocol details into the Linux read/write path is a good idea.
I was thinking of raw block-access to zone device rather than pristine file
abstraction. And in that context, semantics, at this point, are unchanged
(i.e. same as direct writes) while flexibility of async-interface gets
added.
Synchronous-writes on single-zone sound fine, but synchronous-appends on
single-zone do not sound that fine.
What could be a useful addition is a way for O_APPEND/RWF_APPEND writes
to report where they actually wrote, as that comes close to Zone Append
while still making sense at our usual abstraction level for file I/O.
Thanks for suggesting this. O and RWF_APPEND may not go well with block
access as end-of-file will be picked from dev inode. But perhaps a new
flag like RWF_ZONE_APPEND can help to transform writes (aio or uring)
into append without introducing new opcodes.
And, I think, this can fit fine on file-abstraction of ZoneFS as well.