[PATCH 0/9] introduce defrag to xfs_spaceman

This patch set introduces a defrag command to xfs_spaceman. Its functionality and features are
described below (this text is also intended for the man page, so please review):

       defrag [-f free_space] [-i idle_time] [-s segment_size] [-n] [-a]
              defrag defragments the specified XFS file online and non-exclusively. The target XFS
              doesn't have to (and must not) be unmounted.  While defragmentation is in progress,
              file IOs are served 'in parallel'.  The reflink feature must be enabled on the XFS.

              Defragmentation and file IOs

              The target file is virtually divided into many small segments. Segments are the
              smallest units of defragmentation. Segments are defragmented one by one in a
              lock->defragment->unlock->idle manner. File IOs are blocked while the target file is
              locked and are served during the defragmentation idle time (file is unlocked). Though
              file IOs can't truly run in parallel, they are not blocked for long. The locking time
              basically depends on the segment size. Smaller segments usually need less locking time,
              so IOs are blocked more briefly; bigger segments usually need more locking time, so
              IOs are blocked longer. See the -s and -i options to balance defragmentation against
              IO service.
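
The lock->defragment->unlock->idle cycle described above can be sketched roughly as follows. This is
only an illustration, not the code in spaceman/defrag.c: it assumes segments are fixed-size byte
ranges (the real tool picks segments from the file's extents), and the names split_into_segments,
defrag_loop and the lock/defragment/unlock callbacks are hypothetical.

```python
import time

MIN_SEGMENT_SIZE = 4 * 1024 * 1024       # 4MiB minimum, per the -s description
DEFAULT_SEGMENT_SIZE = 16 * 1024 * 1024  # 16MiB default, per the -s description


def split_into_segments(file_size, segment_size=DEFAULT_SEGMENT_SIZE):
    """Return (offset, length) pairs covering file_size bytes.

    Assumption: segments are plain fixed-size byte ranges.
    """
    if segment_size < MIN_SEGMENT_SIZE:
        raise ValueError("segment size must be at least 4MiB")
    return [(off, min(segment_size, file_size - off))
            for off in range(0, file_size, segment_size)]


def defrag_loop(segments, lock, defragment, unlock, idle_ms, sleep=time.sleep):
    """Handle segments one by one: lock -> defragment -> unlock -> idle."""
    for offset, length in segments:
        lock()                       # file IOs are blocked from here ...
        try:
            defragment(offset, length)
        finally:
            unlock()                 # ... until here
        sleep(idle_ms / 1000.0)      # idle time: pending file IOs are served
```

In this sketch a smaller segment_size shortens each locked window, while a larger idle_ms gives file
IOs more time between windows, at the cost of a longer total run.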

              Temporary file

              A temporary file is used for the defragmentation. The temporary file is created in the
              same directory as the target file and is named ".xfsdefrag_<pid>". It is a sparse
              file and contains one defragmentation segment at a time. The temporary file is removed
              automatically when defragmentation finishes or is cancelled by ctrl-c. It remains if
              the kernel crashes while defragmentation is in progress. In that case, the temporary
              file has to be removed manually.
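
As a rough illustration of the naming rule above, the following sketch builds the temporary file
path and lists leftover candidates in a directory, for example after a kernel crash. The helper
names temp_file_path and find_temp_files are hypothetical and not part of xfs_spaceman.

```python
import os


def temp_file_path(target_path, pid=None):
    """Build the ".xfsdefrag_<pid>" path next to the target file."""
    if pid is None:
        pid = os.getpid()
    directory = os.path.dirname(os.path.abspath(target_path))
    return os.path.join(directory, ".xfsdefrag_%d" % pid)


def find_temp_files(directory):
    """List files in directory whose names look like ".xfsdefrag_<pid>"."""
    found = []
    for name in sorted(os.listdir(directory)):
        prefix, _, suffix = name.partition("_")
        if prefix == ".xfsdefrag" and suffix.isdigit():
            found.append(os.path.join(directory, name))
    return found
```

A file found this way is only a leftover if no defrag with that pid is still running, so that should
be checked before removing anything.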

              Free blocks consumption

              Defragmentation works by (trying to) allocate new (contiguous) blocks, copying data and
              then freeing the old (non-contiguous) blocks. Usually the number of old blocks freed
              equals the number of newly allocated blocks, so in the end defragmentation doesn't
              consume free blocks. That holds only if the target file is not sharing blocks with
              other files.  If the target file contains shared blocks, those shared blocks won't
              be freed back to the filesystem as they are still owned by other files, so
              defragmentation allocates more blocks than it frees.  On an existing XFS, free blocks
              might be over-committed when reflink snapshots were created. To avoid driving the XFS
              into a low free blocks state, defragmentation excludes (partially) shared segments
              when the file system free blocks drop below a threshold. See the -f option.
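
The block accounting above can be written out as a small sketch. It assumes every block of a segment
is reallocated and that only unshared old blocks are freed; the function names and the free_bytes
parameter are hypothetical, and the default threshold of 1024 MiB is taken from the -f description.

```python
MIB = 1024 * 1024


def net_blocks_consumed(segment_blocks, shared_blocks):
    """Net free blocks used by defragmenting one segment.

    All segment_blocks are newly allocated; only the unshared old blocks
    are freed, so the net cost equals the number of shared blocks.
    """
    allocated = segment_blocks
    freed = segment_blocks - shared_blocks
    return allocated - freed


def should_defrag_segment(shared_blocks, free_bytes, threshold_mib=1024):
    """Exclude (partially) shared segments when free space is below the threshold."""
    if shared_blocks == 0:
        return True                      # unshared segments cost nothing net
    return free_bytes >= threshold_mib * MIB
```

With no shared blocks the net consumption is zero, which is why only (partially) shared segments
need to be excluded when free space runs low.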

              Safety and consistency

              Defragmentation is guaranteed to be safe and to keep file data consistent across
              ctrl-c and kernel crashes.

              First extent share

              For each segment defragmentation, the current kernel runs a routine that detects
              whether the file is sharing blocks. It takes long if the target file contains a huge
              number of extents and the shared ones, if any, are at the end. The First extent share
              feature works around this issue by making the first several blocks shared. Seeing that
              the first blocks are shared, the kernel routine ends quickly. The side effect is that
              the "share" flag remains on the target file. This feature is enabled by default and
              can be disabled with the -n option.

              extsize and cowextsize

              According to the kernel implementation, extsize and cowextsize can have the following
              impacts on defragmentation: 1) A non-zero extsize causes separate block allocations
              for each extent in the segment, and those blocks are not contiguous. The segment keeps
              the same number of extents after defragmentation (no effect).  2) When extsize and/or
              cowextsize are too big, a lot of pre-allocated blocks remain in memory for a while.
              When new IO comes to those pre-allocated blocks, Copy on Write happens and causes the
              file to become fragmented.

              Readahead

              Readahead tries to fetch the data blocks for the next segment in the background
              during idle time, with less locking. This feature is disabled by default; use -a to
              enable it.

              The command takes the following options:
                 -f free_space
                     The threshold of XFS free blocks in MiB. When free blocks are less than this
                     number, (partially) shared segments are excluded from defragmentation. The
                     default is 1024.

                 -i idle_time
                     The time in milliseconds that defragmentation stays idle after defragmenting
                     a segment and before handling the next. Default number is TOBEDONE.

                 -s segment_size
                     The size limit of segments in bytes. The minimum is 4MiB; the default is
                     16MiB.

                 -n  Disable the First extent share feature. Enabled by default.

                 -a  Enable the readahead feature. Disabled by default.

We tested with a real customer metadump using several different idle_time values and found that
250ms is a good sleep time in practice. Here are some numbers from the test:

Test: running defrag on an image file which is used as the back end of a block device in a
      virtual machine. At the same time, fio is running inside the virtual machine
      on that block device.
block device type:   NVME
File size:           200GiB
parameters to defrag: free_space: 1024 idle_time: 250 First_extent_share: enabled readahead: disabled
Defrag run time:     223 minutes
Number of extents:   6745489(before) -> 203571(after)
Fio read latency:    15.72ms(without defrag) -> 14.53ms(during defrag)
Fio write latency:   32.21ms(without defrag) -> 20.03ms(during defrag)
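
To put the numbers above in perspective, the short calculation below derives the average extent
size and the relative changes from the reported figures. It assumes the 200GiB file is fully
allocated (a sparse file would give smaller averages); the variable and function names are only
illustrative.

```python
GIB = 1024 ** 3
KIB = 1024

file_size = 200 * GIB            # reported file size
extents_before = 6745489         # reported extent count before defrag
extents_after = 203571           # reported extent count after defrag

# Average extent size in KiB, assuming the file is fully allocated.
avg_before_kib = file_size / extents_before / KIB
avg_after_kib = file_size / extents_after / KIB

# Factor by which the extent count shrank.
reduction = extents_before / extents_after


def pct_change(before, after):
    """Relative change from before to after, in percent."""
    return (after - before) / before * 100.0


read_change = pct_change(15.72, 14.53)    # fio read latency, ms
write_change = pct_change(32.21, 20.03)   # fio write latency, ms

print("avg extent: %.1f KiB -> %.1f KiB" % (avg_before_kib, avg_after_kib))
print("extent count reduced %.1fx" % reduction)
print("read latency %+.1f%%, write latency %+.1f%%" % (read_change, write_change))
```

Under that assumption the average extent grows from roughly 31 KiB to roughly 1 MiB, a reduction of
about 33x in extent count.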


Wengang Wang (9):
  xfsprogs: introduce defrag command to spaceman
  spaceman/defrag: pick up segments from target file
  spaceman/defrag: defrag segments
  spaceman/defrag: ctrl-c handler
  spaceman/defrag: exclude shared segments on low free space
  spaceman/defrag: workaround kernel xfs_reflink_try_clear_inode_flag()
  spaceman/defrag: sleeps between segments
  spaceman/defrag: readahead for better performance
  spaceman/defrag: warn on extsize

 spaceman/Makefile |   2 +-
 spaceman/defrag.c | 788 ++++++++++++++++++++++++++++++++++++++++++++++
 spaceman/init.c   |   1 +
 spaceman/space.h  |   1 +
 4 files changed, 791 insertions(+), 1 deletion(-)
 create mode 100644 spaceman/defrag.c

-- 
2.39.3 (Apple Git-146)




