From: Darrick J. Wong <djwong@xxxxxxxxxx> Document the log formats for the atomic extent swapping feature. Signed-off-by: Darrick J. Wong <djwong@xxxxxxxxxx> --- .../allocation_groups.asciidoc | 7 + .../journaling_log.asciidoc | 111 ++++++++++++++++++++ design/XFS_Filesystem_Structure/magic.asciidoc | 2 3 files changed, 120 insertions(+) diff --git a/design/XFS_Filesystem_Structure/allocation_groups.asciidoc b/design/XFS_Filesystem_Structure/allocation_groups.asciidoc index c0ba16a8..7b128838 100644 --- a/design/XFS_Filesystem_Structure/allocation_groups.asciidoc +++ b/design/XFS_Filesystem_Structure/allocation_groups.asciidoc @@ -470,6 +470,13 @@ the FS log if it doesn't understand the flag. | Flag | Description | +XFS_SB_FEAT_INCOMPAT_LOG_XATTRS+ | Extended attribute updates have been committed to the ondisk log. +| +XFS_SB_FEAT_INCOMPAT_LOG_ATOMIC_SWAP+ | +Atomic file content swapping. The filesystem is capable of swapping the +extents mapped to two arbitrary ranges of a file's fork by using intent log +items to track the progress of the high level operation. In other words, a +range swap operation can be restarted if the system goes down, which is +necessary for userspace to commit of new file contents atomically. See the +section about xref:SXI_Log_Item[extent swap log intents] for more information. |===== diff --git a/design/XFS_Filesystem_Structure/journaling_log.asciidoc b/design/XFS_Filesystem_Structure/journaling_log.asciidoc index 8ff437fe..daf9b225 100644 --- a/design/XFS_Filesystem_Structure/journaling_log.asciidoc +++ b/design/XFS_Filesystem_Structure/journaling_log.asciidoc @@ -217,6 +217,8 @@ magic number to distinguish themselves. Buffer data items only appear after | +XFS_LI_BUD+ | 0x1245 | xref:BUD_Log_Item[File Block Mapping Update Done] | +XFS_LI_ATTRI+ | 0x1246 | xref:ATTRI_Log_Item[Extended Attribute Update Intent] | +XFS_LI_ATTRD+ | 0x1247 | xref:ATTRD_Log_Item[Extended Attribute Update Done] +| +XFS_LI_SXI+ | 0x1248 | xref:SXI_Log_Item[File Extent Swap Intent] +| +XFS_LI_SXD+ | 0x1249 | xref:SXD_Log_Item[File Extent Swap Done] |===== Note that all log items (except for transaction headers) MUST start with @@ -649,6 +651,8 @@ file block mapping operation we want. The upper three bytes are flag bits. | Value | Description | +XFS_BMAP_EXTENT_ATTR_FORK+ | Extent is for the attribute fork. | +XFS_BMAP_EXTENT_UNWRITTEN+ | Extent is unwritten. +| +XFS_BMAP_EXTENT_REALTIME+ | Mapping applies to the data fork of a +realtime file. This flag cannot be combined with +XFS_BMAP_EXTENT_ATTR_FORK+. |===== The ``file block mapping update intent'' operation comes first; it tells the @@ -821,6 +825,113 @@ These regions contain the name and value components of the extended attribute being updated, as needed. There are no magic numbers; each region contains the data and nothing else. +[[SXI_Log_Item]] +=== File Extent Swap Intent + +These two log items work together to track the exchange of mapped extents +between the forks of two files. Each operation requires a separate SXI/SXD +pair. The log intent item has the following format: + +[source, c] +---- +struct xfs_sxi_log_format { + uint16_t sxi_type; + uint16_t sxi_size; + uint32_t __pad; + uint64_t sxi_id; + uint64_t sxi_inode1; + uint64_t sxi_inode2; + uint64_t sxi_startoff1; + uint64_t sxi_startoff2; + uint64_t sxi_blockcount; + uint64_t sxi_flags; + int64_t sxi_isize1; + int64_t sxi_isize2; +}; +---- + +*sxi_type*:: +The signature of an SXI operation, 0x1246. This value is in host-endian order, +not big-endian like the rest of XFS. + +*sxi_size*:: +Size of this log item. Should be 1. + +*__pad*:: +Must be zero. + +*sxi_id*:: +A 64-bit number that binds the corresponding SXD log item to this SXI log item. + +*sxi_inode1*:: +Inode number of the first file involved in the operation. + +*sxi_inode2*:: +Inode number of the second file involved in the operation. + +*sxi_startoff1*:: +Starting point within the first file, in units of filesystem blocks. + +*sxi_startoff2*:: +Starting point within the second file, in units of filesystem blocks. + +*sxi_blockcount*:: +The length to be exchanged, in units of filesystem blocks. + +*sxi_flags*:: +Behavioral changes to the operation, as follows: + +.File Extent Swap Intent Item Flags +[options="header"] +|===== +| Value | Description +| +XFS_SWAP_EXTENT_ATTR_FORK+ | Exchange extents between attribute forks. +| +XFS_SWAP_EXTENT_SET_SIZES+ | Exchange the file sizes of the two files +after the operation completes. +| +XFS_SWAP_EXTENT_INO2_SHORTFORM+ | Convert the second file fork back to +inline format after the exchange completes. +|===== + +*sxi_isize1*:: +The original size of the first file, in bytes. This is zero if the ++XFS_SWAP_EXTENT_SET_SIZES+ flag is not set. + +*sxi_isize2*:: +The original size of the second file, in bytes. This is zero if the ++XFS_SWAP_EXTENT_SET_SIZES+ flag is not set. + +[[SXD_Log_Item]] +=== Completion of File Extent Swap + +The ``file extent swap done'' operation complements the ``file extent swap +intent'' operation. This second operation indicates that the update actually +happened, so that log recovery needn't replay the update. The SXD and the +actual updates are typically found in a new transaction following the +transaction in which the SXI was logged. The completion has this format: + +[source, c] +---- +struct xfs_sxd_log_format { + uint16_t sxd_type; + uint16_t sxd_size; + uint32_t __pad; + uint64_t sxd_sxi_id; +}; +---- + +*sxd_type*:: +The signature of an SXD operation, 0x1247. This value is in host-endian order, +not big-endian like the rest of XFS. + +*sxd_size*:: +Size of this log item. Should be 1. + +*__pad*:: +Must be zero. + +*sxd_id*:: +A 64-bit number that binds the corresponding SXI log item to this SXD log item. + [[Inode_Log_Item]] === Inode Updates diff --git a/design/XFS_Filesystem_Structure/magic.asciidoc b/design/XFS_Filesystem_Structure/magic.asciidoc index a343271a..613e50c0 100644 --- a/design/XFS_Filesystem_Structure/magic.asciidoc +++ b/design/XFS_Filesystem_Structure/magic.asciidoc @@ -73,6 +73,8 @@ are not aligned to blocks. | +XFS_LI_BUD+ | 0x1245 | | xref:BUD_Log_Item[File Block Mapping Update Done] | +XFS_LI_ATTRI+ | 0x1246 | | xref:ATTRI_Log_Item[Extended Attribute Update Intent] | +XFS_LI_ATTRD+ | 0x1247 | | xref:ATTRD_Log_Item[Extended Attribute Update Done] +| +XFS_LI_SXI+ | 0x1248 | | xref:SXI_Log_Item[File Extent Swap Intent] +| +XFS_LI_SXD+ | 0x1249 | | xref:SXD_Log_Item[File Extent Swap Done] |===== = Theoretical Limits