On Thu, Jul 18, 2024 at 03:07:59PM +0100, John Garry wrote:
> On 17/07/2024 22:44, Darrick J. Wong wrote:
> > On Wed, Jul 17, 2024 at 09:36:18AM +0000, John Garry wrote:
> > > From: Himanshu Madhani <himanshu.madhani@xxxxxxxxxx>
> > >
> > > Add RWF_ATOMIC flag description for pwritev2().
> > >
> > > Signed-off-by: Himanshu Madhani <himanshu.madhani@xxxxxxxxxx>
> > > [jpg: complete rewrite]
> > > Signed-off-by: John Garry <john.g.garry@xxxxxxxxxx>
> > > ---
> > >  man/man2/readv.2 | 76 ++++++++++++++++++++++++++++++++++++++++++++++++
> > >  1 file changed, 76 insertions(+)
> > >
> > > diff --git a/man/man2/readv.2 b/man/man2/readv.2
> > > index eecde06dc..9c8a11324 100644
> > > --- a/man/man2/readv.2
> > > +++ b/man/man2/readv.2
> > > @@ -193,6 +193,66 @@ which provides lower latency, but may use additional resources.
> > >  .B O_DIRECT
> > >  flag.)
> > >  .TP
> > > +.BR RWF_ATOMIC " (since Linux 6.11)"
> > > +Requires that writes to regular files in block-based filesystems be issued with
> > > +torn-write protection.
> > > +Torn-write protection means that for a power or any other hardware failure,
> > > +all or none of the data from the write will be stored,
> > > +but never a mix of old and new data.
> > > +This flag is meaningful only for
> > > +.BR pwritev2 (),
> > > +and its effect applies only to the data range written by the system call.
> > > +The total write length must be power-of-2 and must be sized in the range
> > > +.RI [ stx_atomic_write_unit_min ,
> > > +.IR stx_atomic_write_unit_max ].
> > > +The write must be at a naturally-aligned offset within the file with respect to
> > > +the total write length -
> > > +for example,
> >
> > Nit: these could be two sentences
> >
> > "The write must be at a naturally-aligned offset within the file with
> > respect to the total write length.  For example, ..."
>
> ok, sure
>
> >
> > > +a write of length 32KB at a file offset of 32KB is permitted,
> > > +however a write of length 32KB at a file offset of 48KB is not permitted.
> >
> > Pickier nit: KiB, not KB.
>
> ok
>
> >
> > > +The upper limit of
> > > +.I iovcnt
> > > +for
> > > +.BR pwritev2 ()
> > > +is in
> >
> > "is given by" ?
>
> ok, fine, I don't mind
>
> >
> > > +.I stx_atomic_write_segments_max.
> > > +Torn-write protection only works with
> > > +.B O_DIRECT
> > > +flag, i.e. buffered writes are not supported.
> > > +To guarantee consistency from the write between a file's in-core state with the
> > > +storage device,
> > > +.BR fdatasync (2),
> > > +or
> > > +.BR fsync (2),
> > > +or
> > > +.BR open (2)
> > > +and either
> > > +.B O_SYNC
> > > +or
> > > +.B O_DSYNC,
> > > +or
> > > +.B pwritev2 ()
> > > +and either
> > > +.B RWF_SYNC
> > > +or
> > > +.B RWF_DSYNC
> > > +is required. Flags
> >
> > This sentence ^^ should start on a new line.
>
> yes
>
> >
> > > +.B O_SYNC
> > > +or
> > > +.B RWF_SYNC
> > > +provide the strongest guarantees for
> > > +.BR RWF_ATOMIC,
> > > +in that all data and also file metadata updates will be persisted for a
> > > +successfully completed write.
> > > +Just using either flags
> > > +.B O_DSYNC
> > > +or
> > > +.B RWF_DSYNC
> > > +means that all data and any file updates will be persisted for a successfully
> > > +completed write.
> > ughh, this is hard to word both concisely and accurately...
>
> > "any file updates" ?  I /think/ the difference between O_SYNC and
> > O_DSYNC is that O_DSYNC persists all data and file metadata updates for
> > the file range that was written, whereas O_SYNC persists all data and
> > file metadata updates for the entire file.
>
> I think that https://man7.org/linux/man-pages/man2/open.2.html#NOTES
> describes it best.
> >
> > Perhaps everything between "Flags O_SYNC or RWF_SYNC..." and "...for a
> > successfully completed write." should instead refer readers to the notes
> > about synchronized I/O flags in the openat manpage?
>
> Maybe that would be better, but we just need to make it clear that
> RWF_ATOMIC provides the guarantee that the data is atomically updated only
> in addition to whatever guarantee we have for metadata updates from
> O_SYNC/O_DSYNC.
>
>
> So maybe:
> RWF_ATOMIC provides the guarantee that any data is written with torn-write
> protection, and additional flags O_SYNC or O_DSYNC provide
> same Synchronized I/O guarantees as documented in <openat manpage reference>

  ^ the same

>
> OK?

Yes.

> > > +Not using any sync flags means that there is no guarantee that data or
> > > +filesystem updates are persisted.
> > > +.TP
> > >  .BR RWF_SYNC " (since Linux 4.7)"
> > >  .\" commit e864f39569f4092c2b2bc72c773b6e486c7e3bd9
> > >  Provide a per-write equivalent of the
> > > @@ -279,10 +339,26 @@ values overflows an
> > >  .I ssize_t
> > >  value.
> > >  .TP
> > > +.B EINVAL
> > > + For
> > > +.BR RWF_ATOMIC
> > > +set,
> >
> > "If RWF_ATOMIC is specified..." ?
> >
> > (to be a bit more consistent with the language around the AT_* flags in
> > openat)
>
> ok, fine
>
> >
> > > +the combination of the sum of the
> > > +.I iov_len
> > > +values and the
> > > +.I offset
> > > +value does not comply with the length and offset torn-write protection rules.
> > > +.TP
> > >  .B EINVAL
> > >  The vector count,
> > >  .IR iovcnt ,
> > >  is less than zero or greater than the permitted maximum.
> > > +For
> > > +.BR RWF_ATOMIC
> > > +set, this maximum is in
> >
> > (same)
> >
> > --D
> >
>
> Thanks for checking,

NP. :)

--D

> John
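
For reference, the length and offset rules quoted in the patch can be sketched as a small C check. This is only an illustration under stated assumptions: the helper name atomic_write_ok() and its parameters are hypothetical and not part of the patch, the limits would in practice be read from the stx_atomic_write_unit_min and stx_atomic_write_unit_max fields mentioned above, and the sketch does not itself call pwritev2().

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper (not from the patch): returns true if a write of
 * `len` bytes at file offset `off` satisfies the RWF_ATOMIC length and
 * offset rules described in the quoted man page text. `unit_min` and
 * `unit_max` stand in for stx_atomic_write_unit_min and
 * stx_atomic_write_unit_max.
 */
static bool atomic_write_ok(uint64_t off, uint64_t len,
                            uint64_t unit_min, uint64_t unit_max)
{
    /* Total write length must be a non-zero power of two. */
    if (len == 0 || (len & (len - 1)) != 0)
        return false;

    /* Total write length must lie in [unit_min, unit_max]. */
    if (len < unit_min || len > unit_max)
        return false;

    /* Offset must be naturally aligned to the total write length. */
    return (off & (len - 1)) == 0;
}
```

With made-up limits of 4 KiB and 64 KiB, atomic_write_ok(32768, 32768, 4096, 65536) holds, while the same 32 KiB write at offset 49152 (48 KiB) is rejected, matching the example in the patch text.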