Re: [PATCH v4 2/3] readv.2: Document RWF_ATOMIC flag

On 17/07/2024 22:44, Darrick J. Wong wrote:
On Wed, Jul 17, 2024 at 09:36:18AM +0000, John Garry wrote:
From: Himanshu Madhani <himanshu.madhani@xxxxxxxxxx>

Add RWF_ATOMIC flag description for pwritev2().

Signed-off-by: Himanshu Madhani <himanshu.madhani@xxxxxxxxxx>
[jpg: complete rewrite]
Signed-off-by: John Garry <john.g.garry@xxxxxxxxxx>
---
  man/man2/readv.2 | 76 ++++++++++++++++++++++++++++++++++++++++++++++++
  1 file changed, 76 insertions(+)

diff --git a/man/man2/readv.2 b/man/man2/readv.2
index eecde06dc..9c8a11324 100644
--- a/man/man2/readv.2
+++ b/man/man2/readv.2
@@ -193,6 +193,66 @@ which provides lower latency, but may use additional resources.
  .B O_DIRECT
  flag.)
  .TP
+.BR RWF_ATOMIC " (since Linux 6.11)"
+Requires that writes to regular files in block-based filesystems be issued with
+torn-write protection.
+Torn-write protection means that for a power or any other hardware failure,
+all or none of the data from the write will be stored,
+but never a mix of old and new data.
+This flag is meaningful only for
+.BR pwritev2 (),
+and its effect applies only to the data range written by the system call.
+The total write length must be power-of-2 and must be sized in the range
+.RI [ stx_atomic_write_unit_min ,
+.IR stx_atomic_write_unit_max ].
+The write must be at a naturally-aligned offset within the file with respect to
+the total write length -
+for example,

Nit: these could be two sentences

"The write must be at a naturally-aligned offset within the file with
respect to the total write length.  For example, ..."

ok, sure


+a write of length 32KB at a file offset of 32KB is permitted,
+however a write of length 32KB at a file offset of 48KB is not permitted.

Pickier nit: KiB, not KB.

ok
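
For reference, a minimal sketch in C of the length and alignment rules quoted above. The helper name atomic_write_ok() is hypothetical, and unit_min/unit_max are assumed to have been read from stx_atomic_write_unit_min/stx_atomic_write_unit_max beforehand; this is not the kernel's actual check.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper (not part of any kernel or libc API): returns true
 * if a write of `len` bytes at file offset `offset` satisfies the rules
 * described in the patch text. unit_min and unit_max stand in for
 * stx_atomic_write_unit_min and stx_atomic_write_unit_max.
 */
static bool atomic_write_ok(uint64_t offset, uint64_t len,
                            uint32_t unit_min, uint32_t unit_max)
{
    /* The total write length must be a power of 2. */
    if (len == 0 || (len & (len - 1)) != 0)
        return false;

    /* The length must lie in [unit_min, unit_max]. */
    if (len < unit_min || len > unit_max)
        return false;

    /* The offset must be naturally aligned to the write length. */
    return (offset & (len - 1)) == 0;
}
```

With limits of 4 KiB and 64 KiB assumed, a 32 KiB write at offset 32 KiB passes and a 32 KiB write at offset 48 KiB does not, matching the example in the patch.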


+The upper limit of
+.I iovcnt
+for
+.BR pwritev2 ()
+is in

"is given by" ?

ok, fine, I don't mind


+.I stx_atomic_write_segments_max.
+Torn-write protection only works with
+.B O_DIRECT
+flag, i.e. buffered writes are not supported.
+To guarantee consistency from the write between a file's in-core state with the
+storage device,
+.BR fdatasync (2),
+or
+.BR fsync (2),
+or
+.BR open (2)
+and either
+.B O_SYNC
+or
+.B O_DSYNC,
+or
+.B pwritev2 ()
+and either
+.B RWF_SYNC
+or
+.B RWF_DSYNC
+is required. Flags

This sentence   ^^ should start on a new line.

yes


+.B O_SYNC
+or
+.B RWF_SYNC
+provide the strongest guarantees for
+.BR RWF_ATOMIC,
+in that all data and also file metadata updates will be persisted for a
+successfully completed write.
+Just using either flags
+.B O_DSYNC
+or
+.B RWF_DSYNC
+means that all data and any file updates will be persisted for a successfully
+completed write.


ughh, this is hard to word both concisely and accurately...

"any file updates" ?  I /think/ the difference between O_SYNC and
O_DSYNC is that O_DSYNC persists all data and file metadata updates for
the file range that was written, whereas O_SYNC persists all data and
file metadata updates for the entire file.

I think that https://man7.org/linux/man-pages/man2/open.2.html#NOTES describes it best.


Perhaps everything between "Flags O_SYNC or RWF_SYNC..." and "...for a
successfully completed write." should instead refer readers to the notes
about synchronized I/O flags in the openat manpage?

Maybe that would be better, but we just need to make it clear that RWF_ATOMIC only guarantees that the data is updated atomically, in addition to whatever guarantee O_SYNC/O_DSYNC give for metadata updates.


So maybe:
RWF_ATOMIC provides the guarantee that any data is written with torn-write protection, and the additional flags O_SYNC or O_DSYNC provide
the same synchronized I/O guarantees as documented in <openat manpage reference>

OK?
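
To illustrate the split proposed above (RWF_ATOMIC for torn-write protection, a sync flag for persistence), here is a rough sketch in C. The wrapper name atomic_dsync_write() is hypothetical. The fallback flag values are what I believe the kernel UAPI header defines and are only used if the installed headers lack them. The caller is assumed to have opened the file with O_DIRECT and to pass a suitably aligned buffer, length and offset.

```c
#define _GNU_SOURCE
#include <sys/types.h>
#include <sys/uio.h>

/* Fallbacks for older headers; values assumed from the kernel UAPI fs.h. */
#ifndef RWF_DSYNC
#define RWF_DSYNC 0x00000002
#endif
#ifndef RWF_ATOMIC
#define RWF_ATOMIC 0x00000040
#endif

/*
 * Hypothetical wrapper: issue one write that asks for torn-write
 * protection (RWF_ATOMIC) and for the data to be persisted before the
 * call returns (RWF_DSYNC). Returns the pwritev2() result unchanged:
 * bytes written, or -1 with errno set.
 */
static ssize_t atomic_dsync_write(int fd, const void *buf, size_t len,
                                  off_t offset)
{
    struct iovec iov = {
        .iov_base = (void *)buf,   /* pwritev2() does not modify the data */
        .iov_len  = len,
    };

    return pwritev2(fd, &iov, 1, offset, RWF_ATOMIC | RWF_DSYNC);
}
```

Dropping RWF_DSYNC here would keep the torn-write protection but, as the patch text says, leave no guarantee that the data has been persisted when the call returns.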



+Not using any sync flags means that there is no guarantee that data or
+filesystem updates are persisted.
+.TP
  .BR RWF_SYNC " (since Linux 4.7)"
  .\" commit e864f39569f4092c2b2bc72c773b6e486c7e3bd9
  Provide a per-write equivalent of the
@@ -279,10 +339,26 @@ values overflows an
  .I ssize_t
  value.
  .TP
+.B EINVAL
+ For
+.BR RWF_ATOMIC
+set,

"If RWF_ATOMIC is specified..." ?

(to be a bit more consistent with the language around the AT_* flags in
openat)

ok, fine


+the combination of the sum of the
+.I iov_len
+values and the
+.I offset
+value does not comply with the length and offset torn-write protection rules.
+.TP
  .B EINVAL
  The vector count,
  .IR iovcnt ,
  is less than zero or greater than the permitted maximum.
+For
+.BR RWF_ATOMIC
+set, this maximum is in

(same)

--D


Thanks for checking,
John




