[PATCH 1/5] design: document atomic file mapping exchange log intent structures

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



From: Darrick J. Wong <djwong@xxxxxxxxxx>

Document the log intent item formats for the mapping exchange feature.

Signed-off-by: Darrick J. Wong <djwong@xxxxxxxxxx>
---
 .../allocation_groups.asciidoc                     |   10 ++
 .../journaling_log.asciidoc                        |  123 ++++++++++++++++++++
 design/XFS_Filesystem_Structure/magic.asciidoc     |    2 
 3 files changed, 135 insertions(+)


diff --git a/design/XFS_Filesystem_Structure/allocation_groups.asciidoc b/design/XFS_Filesystem_Structure/allocation_groups.asciidoc
index c0ba16a8..e22c7344 100644
--- a/design/XFS_Filesystem_Structure/allocation_groups.asciidoc
+++ b/design/XFS_Filesystem_Structure/allocation_groups.asciidoc
@@ -458,6 +458,16 @@ xfs_repair before it can be mounted.
 Large file fork extent counts.  This greatly expands the maximum number of
 space mappings allowed in data and extended attribute file forks.
 
+| +XFS_SB_FEAT_INCOMPAT_EXCHRANGE+ |
+Atomic file mapping exchanges.  The filesystem is capable of exchanging a range
+of mappings between two arbitrary ranges of a file's fork by using log intent
+items to track the progress of the high level exchange operation.  In other
+words, the exchange operation can be restarted if the system goes down, which
+is necessary for userspace to commit of new file contents atomically.  This
+flag has user-visible impacts, which is why it is a permanent incompat flag.
+See the section about xref:XMI_Log_Item[mapping exchange log intents] for more
+information.
+
 |=====
 
 *sb_features_log_incompat*::
diff --git a/design/XFS_Filesystem_Structure/journaling_log.asciidoc b/design/XFS_Filesystem_Structure/journaling_log.asciidoc
index 8ff437fe..9d9fa836 100644
--- a/design/XFS_Filesystem_Structure/journaling_log.asciidoc
+++ b/design/XFS_Filesystem_Structure/journaling_log.asciidoc
@@ -217,6 +217,8 @@ magic number to distinguish themselves.  Buffer data items only appear after
 | +XFS_LI_BUD+			| 0x1245        | xref:BUD_Log_Item[File Block Mapping Update Done]
 | +XFS_LI_ATTRI+		| 0x1246        | xref:ATTRI_Log_Item[Extended Attribute Update Intent]
 | +XFS_LI_ATTRD+		| 0x1247        | xref:ATTRD_Log_Item[Extended Attribute Update Done]
+| +XFS_LI_XMI+			| 0x1248        | xref:XMI_Log_Item[File Mapping Exchange Intent]
+| +XFS_LI_XMD+			| 0x1249        | xref:XMD_Log_Item[File Mapping Exchange Done]
 |=====
 
 Note that all log items (except for transaction headers) MUST start with
@@ -649,6 +651,8 @@ file block mapping operation we want.  The upper three bytes are flag bits.
 | Value				| Description
 | +XFS_BMAP_EXTENT_ATTR_FORK+	| Extent is for the attribute fork.
 | +XFS_BMAP_EXTENT_UNWRITTEN+	| Extent is unwritten.
+| +XFS_BMAP_EXTENT_REALTIME+	| Mapping applies to the data fork of a
+realtime file.  This flag cannot be combined with +XFS_BMAP_EXTENT_ATTR_FORK+.
 |=====
 
 The ``file block mapping update intent'' operation comes first; it tells the
@@ -821,6 +825,125 @@ These regions contain the name and value components of the extended attribute
 being updated, as needed.  There are no magic numbers; each region contains the
 data and nothing else.
 
+[[XMI_Log_Item]]
+=== File Mapping Exchange Intent
+
+These two log items work together to track the exchange of mapped extents
+between the forks of two files.  Each operation requires a separate XMI/XMD
+pair.  The log intent item has the following format:
+
+[source, c]
+----
+struct xfs_xmi_log_format {
+     uint16_t                  xmi_type;
+     uint16_t                  xmi_size;
+     uint32_t                  __pad;
+     uint64_t                  xmi_id;
+     uint64_t                  xmi_inode1;
+     uint64_t                  xmi_inode2;
+     uint32_t                  xmi_igen1;
+     uint32_t                  xmi_igen2;
+     uint64_t                  xmi_startoff1;
+     uint64_t                  xmi_startoff2;
+     uint64_t                  xmi_blockcount;
+     uint64_t                  xmi_flags;
+     int64_t                   xmi_isize1;
+     int64_t                   xmi_isize2;
+};
+----
+
+*xmi_type*::
+The signature of an XMI operation, 0x1248.  This value is in host-endian order,
+not big-endian like the rest of XFS.
+
+*xmi_size*::
+Size of this log item.  Should be 1.
+
+*__pad*::
+Must be zero.
+
+*xmi_id*::
+A 64-bit number that binds the corresponding XMD log item to this XMI log item.
+
+*xmi_inode1*::
+Inode number of the first file involved in the operation.
+
+*xmi_inode2*::
+Inode number of the second file involved in the operation.
+
+*xmi_igen1*::
+Generation number of the first file involved in the operation.
+
+*xmi_igen2*::
+Generation number of the second file involved in the operation.
+
+*xmi_startoff1*::
+Starting point within the first file, in units of filesystem blocks.
+
+*xmi_startoff2*::
+Starting point within the second file, in units of filesystem blocks.
+
+*xmi_blockcount*::
+The length to be exchanged, in units of filesystem blocks.
+
+*xmi_flags*::
+Behavioral changes to the operation, as follows:
+
+.File Extent Swap Intent Item Flags
+[options="header"]
+|=====
+| Value				    | Description
+| +XFS_EXCHMAPS_ATTR_FORK+	    | Exchange extents between attribute forks.
+| +XFS_EXCHMAPS_SET_SIZES+	    | Exchange the file sizes of the two files
+after the operation completes.
+| +XFS_EXCHMAPS_INO1_WRITTEN+	    | Exchange the mappings of two files only
+if the file allocation units mapped to file1's range have been written.
+| +XFS_EXCHMAPS_CLEAR_INO1_REFLINK+ | Clear the reflink flag from inode1 after
+the operation.
+| +XFS_EXCHMAPS_CLEAR_INO2_REFLINK+ | Clear the reflink flag from inode2 after
+the operation.
+|=====
+
+*xmi_isize1*::
+The original size of the first file, in bytes.  This is zero if the
++XFS_EXCHMAPS_SET_SIZES+ flag is not set.
+
+*xmi_isize2*::
+The original size of the second file, in bytes.  This is zero if the
++XFS_EXCHMAPS_SET_SIZES+ flag is not set.
+
+[[XMD_Log_Item]]
+=== Completion of File Mapping Exchange
+
+The ``file mapping exchange done'' operation complements the ``file mapping
+exchange intent'' operation.  This second operation indicates that the update
+actually happened, so that log recovery needn't replay the update.  The XMD
+item and the actual updates are typically found in a new transaction following
+the transaction in which the XMI was logged.  The completion has this format:
+
+[source, c]
+----
+struct xfs_xmd_log_format {
+     uint16_t                  xmd_type;
+     uint16_t                  xmd_size;
+     uint32_t                  __pad;
+     uint64_t                  xmd_xmi_id;
+};
+----
+
+*xmd_type*::
+The signature of an XMD operation, 0x1249.  This value is in host-endian order,
+not big-endian like the rest of XFS.
+
+*xmd_size*::
+Size of this log item.  Should be 1.
+
+*__pad*::
+Must be zero.
+
+*xmd_xmi_id*::
+A 64-bit number that binds the corresponding XMI log item to this XMD log item.
+
 [[Inode_Log_Item]]
 === Inode Updates
 
diff --git a/design/XFS_Filesystem_Structure/magic.asciidoc b/design/XFS_Filesystem_Structure/magic.asciidoc
index a343271a..60952aeb 100644
--- a/design/XFS_Filesystem_Structure/magic.asciidoc
+++ b/design/XFS_Filesystem_Structure/magic.asciidoc
@@ -73,6 +73,8 @@ are not aligned to blocks.
 | +XFS_LI_BUD+			| 0x1245        |       | xref:BUD_Log_Item[File Block Mapping Update Done]
 | +XFS_LI_ATTRI+		| 0x1246        |       | xref:ATTRI_Log_Item[Extended Attribute Update Intent]
 | +XFS_LI_ATTRD+		| 0x1247        |       | xref:ATTRD_Log_Item[Extended Attribute Update Done]
+| +XFS_LI_XMI+			| 0x1248        |       | xref:XMI_Log_Item[File Mapping Exchange Intent]
+| +XFS_LI_XMD+			| 0x1249        |       | xref:XMD_Log_Item[File Mapping Exchange Done]
 |=====
 
 = Theoretical Limits





[Index of Archives]     [XFS Filesystem Development (older mail)]     [Linux Filesystem Development]     [Linux Audio Users]     [Yosemite Trails]     [Linux Kernel]     [Linux RAID]     [Linux SCSI]


  Powered by Linux