[PATCHSET v24.0 00/19] libxfs: atomic file updates

Hi all,

This series creates a new FIEXCHANGE_RANGE system call to exchange
ranges of bytes between two files atomically.  This new functionality
enables data storage programs to stage and commit file updates such that
reader programs will see either the old contents or the new contents in
their entirety, with no chance of torn writes.  A successful call
completion guarantees that the new contents will be seen even if the
system fails.

The ability to swap extent mappings between files in this manner is
critical to supporting online filesystem repair, which is built upon the
strategy of constructing a clean copy of a damaged structure and
committing the new structure into the metadata file atomically.

User programs will be able to update files atomically by opening an
O_TMPFILE, reflinking the source file to it, making whatever updates
they want to make, and exchanging the relevant ranges of the temp file
with the original file.  If the updates are aligned with the file block
size, a new (since v2) flag provides for exchanging only the written
areas.  Callers can arrange for the update to be rejected if the
original file has been changed.
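
For illustration only, the staging half of that workflow might look like
the sketch below.  stage_update() is a hypothetical helper name, not
something from this series; it uses only the existing O_TMPFILE and
FICLONE interfaces.  The final exchange call is deliberately left out,
because its argument structure is defined by this patchset (see
libfrog/fiexchange.h) and is not reproduced here.

```c
#define _GNU_SOURCE             /* for O_TMPFILE */
#include <errno.h>
#include <fcntl.h>
#include <linux/fs.h>           /* FICLONE */
#include <sys/ioctl.h>
#include <unistd.h>

/*
 * Hypothetical helper: create an unnamed temp file in @dir (which must
 * be on the same filesystem as @src_fd) and reflink the whole source
 * file into it.  Returns the temp fd, or -1 with errno set, e.g. when
 * the filesystem supports neither O_TMPFILE nor reflink.
 */
int stage_update(int src_fd, const char *dir)
{
	int tmp_fd = open(dir, O_TMPFILE | O_RDWR, 0600);

	if (tmp_fd < 0)
		return -1;

	/* Share all of the source's blocks with the temp file. */
	if (ioctl(tmp_fd, FICLONE, src_fd) < 0) {
		int saved = errno;

		close(tmp_fd);
		errno = saved;
		return -1;
	}

	/*
	 * The caller now writes its changes to tmp_fd and then commits
	 * them by exchanging the relevant ranges with src_fd (not shown).
	 */
	return tmp_fd;
}
```

Because the temp file has no name, an update that is abandoned before
the exchange simply disappears when the fd is closed.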

The intent behind this new userspace functionality is to enable atomic
rewrites of arbitrary parts of individual files.  For years, application
programmers wanting to ensure the atomicity of a file update had to
write the changes to a new file in the same directory, fsync the new
file, rename the new file on top of the old filename, and then fsync the
directory.  People get it wrong all the time, and $fs hacks abound.
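
For reference, a minimal sketch of that traditional sequence follows.
atomic_replace() and the ".name.tmp" naming are made up for this
example; a real program would also want a unique temp name and EINTR
handling.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <limits.h>
#include <stdio.h>
#include <unistd.h>

/*
 * Replace @dir/@name with @len bytes from @data using the classic
 * write-new-file / fsync / rename / fsync-directory dance.
 * Returns 0 on success, -1 on failure.
 */
int atomic_replace(const char *dir, const char *name,
		   const void *data, size_t len)
{
	char tmp[PATH_MAX], dst[PATH_MAX];
	const char *p = data;
	int fd, dirfd, ret;

	assert(dir && name && data);
	if (snprintf(tmp, sizeof(tmp), "%s/.%s.tmp", dir, name) >= (int)sizeof(tmp) ||
	    snprintf(dst, sizeof(dst), "%s/%s", dir, name) >= (int)sizeof(dst))
		return -1;

	/* 1. Write the new contents to a new file in the same directory. */
	fd = open(tmp, O_WRONLY | O_CREAT | O_TRUNC, 0644);
	if (fd < 0)
		return -1;
	while (len > 0) {
		ssize_t n = write(fd, p, len);

		if (n < 0)
			goto out_unlink;
		p += n;
		len -= (size_t)n;
	}

	/* 2. Flush the new file before it becomes visible. */
	if (fsync(fd) < 0)
		goto out_unlink;
	ret = close(fd);
	fd = -1;
	if (ret < 0)
		goto out_unlink;

	/* 3. Rename the new file on top of the old filename. */
	if (rename(tmp, dst) < 0)
		goto out_unlink;

	/* 4. Flush the directory so that the rename itself is persisted. */
	dirfd = open(dir, O_RDONLY | O_DIRECTORY);
	if (dirfd < 0)
		return -1;
	ret = fsync(dirfd);
	close(dirfd);
	return ret;

out_unlink:
	if (fd >= 0)
		close(fd);
	unlink(tmp);
	return -1;
}
```

Note that this replaces the whole file and gives it a new inode; it
cannot rewrite part of a file in place.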

The reference implementation in XFS creates a new log incompat feature
and log intent items to track high level progress of swapping ranges of
two files and finish interrupted work if the system goes down.  Sample
code can be found in the corresponding changes to xfs_io to exercise the
use case mentioned above.

Note that this function is /not/ the O_DIRECT atomic file writes concept
that has also been floating around for years.  This RFC is constructed
entirely in software, which means that there are no limitations other
than the general filesystem limits.

As a side note, the original motivation behind the kernel functionality
is online repair of file-based metadata.  The atomic file swap is
implemented as an atomic inode fork swap, which means that we can
implement online reconstruction of extended attributes and directories
by building new ones in another inode and atomically swapping the
contents.

Subsequent patchsets adapt the online filesystem repair code to use
atomic extent swapping.  This enables repair functions to construct a
clean copy of a directory, xattr information, symbolic links, realtime
bitmaps, and realtime summary information in a temporary inode.  If this
completes successfully, the new contents can be swapped atomically into
the inode being repaired.  This is essential to avoid making corruption
problems worse if the system goes down in the middle of running repair.

This patchset also ports the old XFS extent swap ioctl interface to use
the new extent swap code.

For userspace, this series also includes the userspace pieces needed to
test the new functionality, and a sample implementation of atomic file
updates.

Question: Should we really bother with fsdevel bikeshedding?  Most
filesystems cannot support this functionality, so we could keep it
private to XFS for now.

If you're going to start using this mess, you probably ought to just
pull from my git trees, which are linked below.

This is an extraordinary way to destroy everything.  Enjoy!
Comments and questions are, as always, welcome.

--D

kernel git tree:
https://git.kernel.org/cgit/linux/kernel/git/djwong/xfs-linux.git/log/?h=atomic-file-updates

xfsprogs git tree:
https://git.kernel.org/cgit/linux/kernel/git/djwong/xfsprogs-dev.git/log/?h=atomic-file-updates

fstests git tree:
https://git.kernel.org/cgit/linux/kernel/git/djwong/xfstests-dev.git/log/?h=atomic-file-updates

xfsdocs git tree:
https://git.kernel.org/cgit/linux/kernel/git/djwong/xfs-documentation.git/log/?h=atomic-file-updates
---
 configure.ac                    |    1 
 fsr/xfs_fsr.c                   |  214 ++++---
 include/builddefs.in            |    1 
 include/jdm.h                   |   24 +
 include/xfs_inode.h             |    5 
 include/xfs_trace.h             |   13 
 io/Makefile                     |    6 
 io/atomicupdate.c               |  387 ++++++++++++
 io/init.c                       |    1 
 io/inject.c                     |    1 
 io/io.h                         |    5 
 io/open.c                       |   27 +
 io/swapext.c                    |  195 +++++-
 libfrog/Makefile                |    6 
 libfrog/fiexchange.h            |  105 +++
 libfrog/file_exchange.c         |  184 ++++++
 libfrog/file_exchange.h         |   16 
 libfrog/fsgeom.c                |   45 +
 libfrog/fsgeom.h                |    7 
 libhandle/jdm.c                 |  117 ++++
 libxfs/Makefile                 |    2 
 libxfs/defer_item.c             |   79 ++
 libxfs/libxfs_priv.h            |   30 +
 libxfs/xfs_bmap.h               |    4 
 libxfs/xfs_defer.c              |    7 
 libxfs/xfs_defer.h              |    3 
 libxfs/xfs_errortag.h           |    4 
 libxfs/xfs_format.h             |   15 
 libxfs/xfs_fs.h                 |    2 
 libxfs/xfs_log_format.h         |   80 ++
 libxfs/xfs_sb.c                 |    3 
 libxfs/xfs_swapext.c            | 1256 +++++++++++++++++++++++++++++++++++++++
 libxfs/xfs_swapext.h            |  170 +++++
 libxfs/xfs_symlink_remote.c     |   47 +
 libxfs/xfs_symlink_remote.h     |    1 
 libxfs/xfs_trans_space.h        |    4 
 logprint/log_misc.c             |   11 
 logprint/log_print_all.c        |   12 
 logprint/log_redo.c             |  128 ++++
 logprint/logprint.h             |    6 
 m4/package_libcdev.m4           |   20 +
 man/man2/ioctl_xfs_fsgeometry.2 |    3 
 man/man8/xfs_io.8               |   87 +++
 43 files changed, 3183 insertions(+), 151 deletions(-)
 create mode 100644 io/atomicupdate.c
 create mode 100644 libfrog/fiexchange.h
 create mode 100644 libfrog/file_exchange.c
 create mode 100644 libfrog/file_exchange.h
 create mode 100644 libxfs/xfs_swapext.c
 create mode 100644 libxfs/xfs_swapext.h



