Re: [PATCH v4 2/3] readv.2: Document RWF_ATOMIC flag

On Thu, Jul 18, 2024 at 03:07:59PM +0100, John Garry wrote:
> On 17/07/2024 22:44, Darrick J. Wong wrote:
> > On Wed, Jul 17, 2024 at 09:36:18AM +0000, John Garry wrote:
> > > From: Himanshu Madhani <himanshu.madhani@xxxxxxxxxx>
> > > 
> > > Add RWF_ATOMIC flag description for pwritev2().
> > > 
> > > Signed-off-by: Himanshu Madhani <himanshu.madhani@xxxxxxxxxx>
> > > [jpg: complete rewrite]
> > > Signed-off-by: John Garry <john.g.garry@xxxxxxxxxx>
> > > ---
> > >   man/man2/readv.2 | 76 ++++++++++++++++++++++++++++++++++++++++++++++++
> > >   1 file changed, 76 insertions(+)
> > > 
> > > diff --git a/man/man2/readv.2 b/man/man2/readv.2
> > > index eecde06dc..9c8a11324 100644
> > > --- a/man/man2/readv.2
> > > +++ b/man/man2/readv.2
> > > @@ -193,6 +193,66 @@ which provides lower latency, but may use additional resources.
> > >   .B O_DIRECT
> > >   flag.)
> > >   .TP
> > > +.BR RWF_ATOMIC " (since Linux 6.11)"
> > > +Requires that writes to regular files in block-based filesystems be issued with
> > > +torn-write protection.
> > > +Torn-write protection means that for a power or any other hardware failure,
> > > +all or none of the data from the write will be stored,
> > > +but never a mix of old and new data.
> > > +This flag is meaningful only for
> > > +.BR pwritev2 (),
> > > +and its effect applies only to the data range written by the system call.
> > > +The total write length must be power-of-2 and must be sized in the range
> > > +.RI [ stx_atomic_write_unit_min ,
> > > +.IR stx_atomic_write_unit_max ].
> > > +The write must be at a naturally-aligned offset within the file with respect to
> > > +the total write length -
> > > +for example,
> > 
> > Nit: these could be two sentences
> > 
> > "The write must be at a naturally-aligned offset within the file with
> > respect to the total write length.  For example, ..."
> 
> ok, sure
> 
> > 
> > > +a write of length 32KB at a file offset of 32KB is permitted,
> > > +however a write of length 32KB at a file offset of 48KB is not permitted.
> > 
> > Pickier nit: KiB, not KB.
> 
> ok
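
For illustration only (this is not part of the patch): the length and
offset rules quoted above could be checked in userspace roughly as
follows. The function name atomic_write_ok() is hypothetical, and
unit_min/unit_max are assumed to have been read beforehand from the
stx_atomic_write_unit_min and stx_atomic_write_unit_max fields the patch
refers to.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper: returns true if a write of `len` bytes at file
 * offset `offset` satisfies the rules described in the patch text:
 *   - len is a power of two,
 *   - len lies in [unit_min, unit_max],
 *   - offset is naturally aligned, i.e. a multiple of len.
 */
bool atomic_write_ok(uint64_t offset, uint64_t len,
                     uint64_t unit_min, uint64_t unit_max)
{
    /* Power-of-two test: exactly one bit set. */
    if (len == 0 || (len & (len - 1)) != 0)
        return false;

    if (len < unit_min || len > unit_max)
        return false;

    /* Natural alignment with respect to the total write length. */
    return (offset & (len - 1)) == 0;
}
```

With limits of, say, 4 KiB and 64 KiB, this accepts the 32 KiB write at
offset 32 KiB from the example and rejects the one at offset 48 KiB. The
kernel remains the authority; a check like this only avoids an obvious
EINVAL.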
> 
> > 
> > > +The upper limit of
> > > +.I iovcnt
> > > +for
> > > +.BR pwritev2 ()
> > > +is in
> > 
> > "is given by" ?
> 
> ok, fine, I don't mind
> 
> > 
> > > +.I stx_atomic_write_segments_max.
> > > +Torn-write protection only works with
> > > +.B O_DIRECT
> > > +flag, i.e. buffered writes are not supported.
> > > +To guarantee consistency from the write between a file's in-core state with the
> > > +storage device,
> > > +.BR fdatasync (2),
> > > +or
> > > +.BR fsync (2),
> > > +or
> > > +.BR open (2)
> > > +and either
> > > +.B O_SYNC
> > > +or
> > > +.B O_DSYNC,
> > > +or
> > > +.B pwritev2 ()
> > > +and either
> > > +.B RWF_SYNC
> > > +or
> > > +.B RWF_DSYNC
> > > +is required. Flags
> > 
> > This sentence   ^^ should start on a new line.
> 
> yes
> 
> > 
> > > +.B O_SYNC
> > > +or
> > > +.B RWF_SYNC
> > > +provide the strongest guarantees for
> > > +.BR RWF_ATOMIC,
> > > +in that all data and also file metadata updates will be persisted for a
> > > +successfully completed write.
> > > +Just using either flags
> > > +.B O_DSYNC
> > > +or
> > > +.B RWF_DSYNC
> > > +means that all data and any file updates will be persisted for a successfully
> > > +completed write.
> > 
> 
> ughh, this is hard to word both concisely and accurately...
> 
> > "any file updates" ?  I /think/ the difference between O_SYNC and
> > O_DSYNC is that O_DSYNC persists all data and file metadata updates for
> > the file range that was written, whereas O_SYNC persists all data and
> > file metadata updates for the entire file.
> 
> I think that https://man7.org/linux/man-pages/man2/open.2.html#NOTES
> describes it best.
> 
> > 
> > Perhaps everything between "Flags O_SYNC or RWF_SYNC..." and "...for a
> > successfully completed write." should instead refer readers to the notes
> > about synchronized I/O flags in the openat manpage?
> 
> Maybe that would be better, but we just need to make it clear that
> RWF_ATOMIC provides the guarantee that the data is atomically updated only
> in addition to whatever guarantee we have for metadata updates from
> O_SYNC/O_DSYNC.
> 
> 
> So maybe:
> RWF_ATOMIC provides the guarantee that any data is written with torn-write
> protection, and additional flags O_SYNC or O_DSYNC provide
> same Synchronized I/O guarantees as documented in <openat manpage reference>

  ^ the same

> 
> OK?

Yes.
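
As a rough sketch of how the two guarantees combine in practice (again,
not part of the patch): RWF_ATOMIC asks for torn-write protection, and
RWF_DSYNC adds the synchronized I/O guarantee for the same call. The
wrapper name atomic_dsync_write() is hypothetical. The sketch assumes fd
was opened with O_DIRECT, that buf, len and offset already meet the
alignment and size rules, and it supplies a fallback definition of
RWF_ATOMIC (0x40, the value in the kernel UAPI header) in case the libc
headers predate the flag.

```c
#define _GNU_SOURCE
#include <sys/types.h>
#include <sys/uio.h>

/* Assumption: fallback for libc headers that do not yet define the flag. */
#ifndef RWF_ATOMIC
#define RWF_ATOMIC 0x00000040
#endif

/*
 * Hypothetical wrapper: issue a single-segment write that requests
 * torn-write protection (RWF_ATOMIC) together with data-integrity
 * completion for this call (RWF_DSYNC).
 * Returns the number of bytes written, or -1 with errno set.
 */
ssize_t atomic_dsync_write(int fd, void *buf, size_t len, off_t offset)
{
    struct iovec iov = {
        .iov_base = buf,
        .iov_len  = len,
    };

    /* One iovec, so iovcnt is 1. */
    return pwritev2(fd, &iov, 1, offset, RWF_ATOMIC | RWF_DSYNC);
}
```

Dropping RWF_DSYNC here would still request torn-write protection, but,
as the patch text says, with no guarantee that the data has been
persisted when the call returns.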

> > > +Not using any sync flags means that there is no guarantee that data or
> > > +filesystem updates are persisted.
> > > +.TP
> > >   .BR RWF_SYNC " (since Linux 4.7)"
> > >   .\" commit e864f39569f4092c2b2bc72c773b6e486c7e3bd9
> > >   Provide a per-write equivalent of the
> > > @@ -279,10 +339,26 @@ values overflows an
> > >   .I ssize_t
> > >   value.
> > >   .TP
> > > +.B EINVAL
> > > + For
> > > +.BR RWF_ATOMIC
> > > +set,
> > 
> > "If RWF_ATOMIC is specified..." ?
> > 
> > (to be a bit more consistent with the language around the AT_* flags in
> > openat)
> 
> ok, fine
> 
> > 
> > > +the combination of the sum of the
> > > +.I iov_len
> > > +values and the
> > > +.I offset
> > > +value does not comply with the length and offset torn-write protection rules.
> > > +.TP
> > >   .B EINVAL
> > >   The vector count,
> > >   .IR iovcnt ,
> > >   is less than zero or greater than the permitted maximum.
> > > +For
> > > +.BR RWF_ATOMIC
> > > +set, this maximum is in
> > 
> > (same)
> > 
> > --D
> > 
> 
> Thanks for checking,

NP. :)

--D

> John
> 
> 



