[XFS updates] XFS development tree branch, for-next, updated. v3.7-rc1-14-gd35e88f

xfs@xxxxxxxxxxx · Wed, 17 Oct 2012 14:18:32 -0500

This is an automated email from the git hooks/post-receive script. It was
generated because a ref change was pushed to the repository containing
the project "XFS development tree".

The branch, for-next has been updated
  d35e88f xfs: only update the last_sync_lsn when a transaction completes
  33479e0 xfs: remove xfs_iget.c
  fa96aca xfs: move inode locking functions to xfs_inode.c
  6d8b79c xfs: rename xfs_sync.[ch] to xfs_icache.[ch]
  c75921a xfs: xfs_quiesce_attr() should quiesce the log like unmount
  c7eea6f xfs: move xfs_quiesce_attr() into xfs_super.c
  34061f5 xfs: xfs_sync_fsdata is redundant
  5889608 xfs: syncd workqueue is no more
  9aa0500 xfs: xfs_sync_data is redundant.
  cf2931d xfs: Bring some sanity to log unmounting
  f661f1e xfs: sync work is now only periodic log work
  7f7bebe xfs: don't run the sync work if the filesystem is read-only
  7e18530 xfs: rationalise xfs_mount_wq users
  33c7a2b xfs: xfs_syncd_stop must die
      from  ddffeb8c4d0331609ef2581d84de4d763607bd37 (commit)

Those revisions listed above that are new to this repository have
not appeared on any other notification email; so we list those
revisions in full, below.

- Log -----------------------------------------------------------------
commit d35e88faa3b0fc2cea35c3b2dca358b5cd09b45f
Author: Dave Chinner <dchinner@xxxxxxxxxx>
Date:   Mon Oct 8 21:56:12 2012 +1100

    xfs: only update the last_sync_lsn when a transaction completes

    The log write code stamps each iclog with the current tail LSN in
    the iclog header so that recovery knows where to find the tail of
    the log once it has found the head. Normally this is taken from the
    first item on the AIL - the log item that corresponds to the oldest
    active item in the log.

    The problem is that when the AIL is empty, the tail lsn is derived
    from the l_last_sync_lsn, which is the LSN of the last iclog to
    be written to the log. In most cases this doesn't happen, because
    the AIL is rarely empty on an active filesystem. However, when it
    does, it opens up an interesting case when the transaction being
    committed to the iclog spans multiple iclogs.

    That is, the first iclog is stamped with the l_last_sync_lsn, and IO
    is issued. Then the next iclog is setup, the changes copied into the
    iclog (takes some time), and then the l_last_sync_lsn is stamped
    into the header and IO is issued. This is still the same
    transaction, so the tail lsn of both iclogs must be the same for log
    recovery to find the entire transaction to be able to replay it.

    The problem arises in that the iclog buffer IO completion updates
    the l_last_sync_lsn with its own LSN. Therefore, if the first iclog
    completes its IO before the second iclog is filled and has the tail
    lsn stamped in it, the second iclog will have the LSN of the first
    iclog stamped into its tail lsn field. If the system fails at this
    point, log recovery will not see a complete transaction, so the
    transaction will not be replayed.

    The fix is simple - the l_last_sync_lsn is updated when an iclog
    buffer IO completes, and this is incorrect. The l_last_sync_lsn
    should be updated when a transaction is completed by an iclog buffer
    IO. That is, only iclog buffers that have transaction commit
    callbacks attached to them should update the l_last_sync_lsn. This
    means that the last_sync_lsn will only move forward when a commit
    record is written, not in the middle of a large transaction that is
    rolling through multiple iclog buffers.

    Signed-off-by: Dave Chinner <dchinner@xxxxxxxxxx>
    Reviewed-by: Mark Tinguely <tinguely@xxxxxxx>
    Reviewed-by: Christoph Hellwig <hch@xxxxxx>
    Signed-off-by: Ben Myers <bpm@xxxxxxx>

commit 33479e0542df066fb0b47df18780e93bfe6e0dc5
Author: Dave Chinner <dchinner@xxxxxxxxxx>
Date:   Mon Oct 8 21:56:11 2012 +1100

    xfs: remove xfs_iget.c

    The inode cache functions remaining in xfs_iget.c can be moved to xfs_icache.c
    along with the other inode cache functions. This removes all functionality from
    xfs_iget.c, so the file can simply be removed.

    This move results in various functions now only having the scope of a single
    file (e.g. xfs_inode_free()), so clean up all the definitions and exported
    prototypes in xfs_icache.[ch] and xfs_inode.h appropriately.

    Signed-off-by: Dave Chinner <dchinner@xxxxxxxxxx>
    Reviewed-by: Christoph Hellwig <hch@xxxxxx>
    Reviewed-by: Mark Tinguely <tinguely@xxxxxxx>
    Signed-off-by: Ben Myers <bpm@xxxxxxx>

commit fa96acadf1eb712fca6d59922ad93787c87e44ec
Author: Dave Chinner <dchinner@xxxxxxxxxx>
Date:   Mon Oct 8 21:56:10 2012 +1100

    xfs: move inode locking functions to xfs_inode.c

    xfs_ilock() and friends really aren't related to the inode cache in
    any way, so move them to xfs_inode.c with all the other inode
    related functionality.

    While doing this move, move the xfs_ilock() tracepoints to *before*
    the lock is taken so that when a hang on a lock occurs we have
    events to indicate which process and what inode we were trying to
    lock when the hang occurred. This is much better than the current
    silence we get on a hang...

    Signed-off-by: Dave Chinner <dchinner@xxxxxxxxxx>
    Reviewed-by: Christoph Hellwig <hch@xxxxxx>
    Reviewed-by: Mark Tinguely <tinguely@xxxxxxx>
    Signed-off-by: Ben Myers <bpm@xxxxxxx>

commit 6d8b79cfca39399ef9115fb65dde85993455c9a3
Author: Dave Chinner <dchinner@xxxxxxxxxx>
Date:   Mon Oct 8 21:56:09 2012 +1100

    xfs: rename xfs_sync.[ch] to xfs_icache.[ch]

    xfs_sync.c now only contains inode reclaim functions and inode cache
    iteration functions. It is not related to sync operations anymore.
    Rename to xfs_icache.c to reflect its contents and prepare for
    consolidation with the other inode cache file that exists
    (xfs_iget.c).

    Signed-off-by: Dave Chinner <dchinner@xxxxxxxxxx>
    Reviewed-by: Christoph Hellwig <hch@xxxxxx>
    Reviewed-by: Mark Tinguely <tinguely@xxxxxxx>
    Signed-off-by: Ben Myers <bpm@xxxxxxx>

commit c75921a72a7c4bb73a5e09a697a672722e5543f1
Author: Dave Chinner <dchinner@xxxxxxxxxx>
Date:   Mon Oct 8 21:56:08 2012 +1100

    xfs: xfs_quiesce_attr() should quiesce the log like unmount

    xfs_quiesce_attr() is supposed to leave the log empty with an
    unmount record written. Right now it does not wait for the AIL to be
    emptied before writing the unmount record, nor does it wait for
    metadata IO completion, either. Fix it to use the same method and
    code as xfs_log_unmount().

    Signed-off-by: Dave Chinner <dchinner@xxxxxxxxxx>
    Reviewed-by: Christoph Hellwig <hch@xxxxxx>
    Reviewed-by: Mark Tinguely <tinguely@xxxxxxx>
    Signed-off-by: Ben Myers <bpm@xxxxxxx>

commit c7eea6f7adca4501d2c2db7f0f7c9dc88efac95e
Author: Dave Chinner <dchinner@xxxxxxxxxx>
Date:   Mon Oct 8 21:56:07 2012 +1100

    xfs: move xfs_quiesce_attr() into xfs_super.c

    Both callers of xfs_quiesce_attr() are in xfs_super.c, and there's
    nothing really sync-specific about this functionality so it doesn't
    really matter where it lives. Move it to be next to its callers, so
    all the remount/sync_fs code is in the one place.

    Signed-off-by: Dave Chinner <dchinner@xxxxxxxxxx>
    Reviewed-by: Christoph Hellwig <hch@xxxxxx>
    Reviewed-by: Mark Tinguely <tinguely@xxxxxxx>
    Signed-off-by: Ben Myers <bpm@xxxxxxx>

commit 34061f5c420561dd42addd252811a1fa4b0ac69b
Author: Dave Chinner <dchinner@xxxxxxxxxx>
Date:   Mon Oct 8 21:56:06 2012 +1100

    xfs: xfs_sync_fsdata is redundant

    Why do we need to write the superblock to disk once we've written
    all the data?  We don't actually - the reasons for doing this are
    lost in the mists of time, and go back to the way Irix used to drive
    VFS flushing.

    On Linux, this code is only called from two contexts: remount and
    .sync_fs. In the remount case, the call is followed by a metadata
    sync, which unpins and writes the superblock.  In the sync_fs case,
    we only need to force the log to disk to ensure that the superblock
    is correctly on disk, so we don't actually need to write it. Hence
    the functionality is either redundant or superfluous and thus can be
    removed.

    Seeing as xfs_quiesce_data is essentially now just a log force,
    remove it as well and fold the code back into the two callers.
    Neither of them needs the log covering check, either, as that is
    redundant for the remount case, and unnecessary for the .sync_fs
    case.

    Signed-off-by: Dave Chinner <dchinner@xxxxxxxxxx>
    Reviewed-by: Christoph Hellwig <hch@xxxxxx>
    Reviewed-by: Mark Tinguely <tinguely@xxxxxxx>
    Signed-off-by: Ben Myers <bpm@xxxxxxx>

commit 5889608df35783590251cfd440fa5d48f1855179
Author: Dave Chinner <dchinner@xxxxxxxxxx>
Date:   Mon Oct 8 21:56:05 2012 +1100

    xfs: syncd workqueue is no more

    With the syncd functions moved to the log and/or removed, the syncd
    workqueue is the only remaining bit left. It is used by the log
    covering/ail pushing work, as well as by the inode reclaim work.

    Given how cheap workqueues are these days, give the log and inode
    reclaim work their own work queues and kill the syncd work queue.

    Signed-off-by: Dave Chinner <dchinner@xxxxxxxxxx>
    Reviewed-by: Mark Tinguely <tinguely@xxxxxxx>
    Reviewed-by: Christoph Hellwig <hch@xxxxxx>
    Signed-off-by: Ben Myers <bpm@xxxxxxx>

commit 9aa05000f2b7cab4be582afba64af10b2d74727e
Author: Dave Chinner <dchinner@xxxxxxxxxx>
Date:   Mon Oct 8 21:56:04 2012 +1100

    xfs: xfs_sync_data is redundant.

    We don't do any data writeback from XFS any more - the VFS is
    completely responsible for that, including for freeze. We can
    replace the remaining caller with a VFS level function that
    achieves the same thing, but without conflicting with current
    writeback work.

    This means we can remove the flush_work and xfs_flush_inodes() - the
    VFS functionality completely replaces the internal flush queue for
    doing this writeback work in a separate context to avoid stack
    overruns.

    This does have one complication - it cannot be called with page
    locks held.  Hence move the flushing of delalloc space when ENOSPC
    occurs back up into xfs_file_aio_buffered_write when we don't hold
    any locks that will stall writeback.

    Unfortunately, writeback_inodes_sb_if_idle() is not sufficient to
    trigger delalloc conversion fast enough to prevent spurious ENOSPC
    when there are hundreds of writers, thousands of small files and GBs
    of free RAM.  Hence we need to use sync_sb_inodes() to block callers
    while we wait for writeback like the previous xfs_flush_inodes
    implementation did.

    That means we have to hold the s_umount lock here, but because this
    call can nest inside i_mutex (the parent directory in the create
    case, held by the VFS), we have to use down_read_trylock() to avoid
    potential deadlocks. In practice, this trylock will succeed on
    almost every attempt as unmount/remount type operations are
    exceedingly rare.
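
The trylock pattern described above can be sketched in userspace as follows.
This is an assumption-laden toy, not the patch itself: toy_rwsem is a
hypothetical reader/writer lock built on C11 atomics standing in for
s_umount, and the "flush" is just a counter.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Toy reader/writer lock: >0 = reader count, 0 = free, -1 = writer. */
struct toy_rwsem {
    atomic_int state;
};

/* Take a read lock only if that can be done without waiting. */
static bool toy_down_read_trylock(struct toy_rwsem *sem)
{
    int old = atomic_load(&sem->state);

    while (old >= 0) {
        /* On failure, old is reloaded and the loop re-checks it. */
        if (atomic_compare_exchange_weak(&sem->state, &old, old + 1))
            return true;
    }
    return false;   /* a writer holds the lock */
}

static void toy_up_read(struct toy_rwsem *sem)
{
    int prev = atomic_fetch_sub(&sem->state, 1);

    assert(prev > 0);
    (void)prev;
}

/* Models unmount/remount taking the lock exclusively. */
static bool toy_down_write_trylock(struct toy_rwsem *sem)
{
    int expected = 0;

    return atomic_compare_exchange_strong(&sem->state, &expected, -1);
}

static void toy_up_write(struct toy_rwsem *sem)
{
    atomic_store(&sem->state, 0);
}

/* Hypothetical superblock: the lock plus a count of flushes that ran. */
struct toy_sb {
    struct toy_rwsem umount;
    int flushes;
};

static void toy_sb_init(struct toy_sb *sb)
{
    atomic_init(&sb->umount.state, 0);
    sb->flushes = 0;
}

/*
 * Flush only if the read lock is available right now. Skipping the
 * flush is preferred over blocking, because the caller may already
 * hold another lock and blocking here could deadlock.
 */
static bool toy_flush_inodes(struct toy_sb *sb)
{
    if (!toy_down_read_trylock(&sb->umount))
        return false;
    sb->flushes++;              /* placeholder for the blocking writeback */
    toy_up_read(&sb->umount);
    return true;
}
```

The flush is skipped only while a writer (the unmount/remount stand-in)
holds the lock, which matches the expectation that the trylock succeeds on
almost every attempt.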

    Note: we always need to pass a count of zero to
    generic_file_buffered_write() as the previously written byte count.
    We only did this by accident before this patch by virtue of ret
    always being zero when there are no errors. Make this explicit
    rather than needing to specifically zero ret in the ENOSPC retry
    case.

    Signed-off-by: Dave Chinner <dchinner@xxxxxxxxxx>
    Tested-by: Brian Foster <bfoster@xxxxxxxxxx>
    Reviewed-by: Christoph Hellwig <hch@xxxxxx>
    Signed-off-by: Ben Myers <bpm@xxxxxxx>

commit cf2931db2d189ce0583be7ae880d7e3f8c15f623
Author: Dave Chinner <dchinner@xxxxxxxxxx>
Date:   Mon Oct 8 21:56:03 2012 +1100

    xfs: Bring some sanity to log unmounting

    When unmounting the filesystem, there are lots of operations that
    need to be done in a specific order, and they are spread across
    a couple of functions. We have to drain the AIL before we
    write the unmount record, and we have to shut down the background
    log work before we do either of them.

    But this is all split haphazardly across xfs_unmountfs() and
    xfs_log_unmount(). Move all the AIL flushing and log manipulations
    to xfs_log_unmount() so that the responsibilities of each function
    are clear and the operations they perform obvious.
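
The ordering constraint stated above can be written down as a minimal
sketch. The step names and toy_* helpers are hypothetical and do nothing
except record the order in which they run; the real xfs_log_unmount() does
considerably more.

```c
#include <assert.h>
#include <stddef.h>

/* The three ordered steps named in the commit message. */
enum toy_step {
    TOY_STOP_LOG_WORK,          /* shut down background log work */
    TOY_DRAIN_AIL,              /* push everything out of the AIL */
    TOY_WRITE_UNMOUNT_RECORD    /* finally, write the unmount record */
};

/* Hypothetical mount structure that just records what happened. */
struct toy_mount {
    enum toy_step done[3];
    size_t        ndone;
};

static void toy_record(struct toy_mount *mp, enum toy_step step)
{
    assert(mp->ndone < 3);
    mp->done[mp->ndone++] = step;
}

/*
 * Keep all three steps in one function so the required order is
 * visible in one place instead of being split across callers.
 */
static void toy_log_unmount(struct toy_mount *mp)
{
    toy_record(mp, TOY_STOP_LOG_WORK);
    toy_record(mp, TOY_DRAIN_AIL);
    toy_record(mp, TOY_WRITE_UNMOUNT_RECORD);
}
```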

    Signed-off-by: Dave Chinner <dchinner@xxxxxxxxxx>
    Reviewed-by: Christoph Hellwig <hch@xxxxxx>
    Reviewed-by: Mark Tinguely <tinguely@xxxxxxx>
    Signed-off-by: Ben Myers <bpm@xxxxxxx>

commit f661f1e0bf5002bdcc8b5810ad0a184a1841537f
Author: Dave Chinner <dchinner@xxxxxxxxxx>
Date:   Mon Oct 8 21:56:02 2012 +1100

    xfs: sync work is now only periodic log work

    The only thing the periodic sync work does now is flush the AIL and
    idle the log. These are really functions of the log code, so move
    the work to xfs_log.c and rename it appropriately.

    The only wart that this leaves behind is the xfssyncd_centisecs
    sysctl, otherwise the xfssyncd is dead. Clean up any comments that
    relate to xfssyncd to reflect its passing.

    Signed-off-by: Dave Chinner <dchinner@xxxxxxxxxx>
    Reviewed-by: Mark Tinguely <tinguely@xxxxxxx>
    Reviewed-by: Christoph Hellwig <hch@xxxxxx>
    Signed-off-by: Ben Myers <bpm@xxxxxxx>

commit 7f7bebefba152c5bdfe961cd2e97e8695a32998c
Author: Dave Chinner <dchinner@xxxxxxxxxx>
Date:   Mon Oct 8 21:56:01 2012 +1100

    xfs: don't run the sync work if the filesystem is read-only

    If the filesystem is mounted or remounted read-only, stop the sync
    worker that tries to flush or cover the log if the filesystem is
    dirty. It's read-only, so it isn't dirty. Restart it on a remount,rw
    as necessary. This avoids the need for RO checks in the work.

    Similarly, stop the sync work when the filesystem is frozen, and
    start it again when the filesystem is thawed. This avoids the need
    for special freeze checks in the work.

    Signed-off-by: Dave Chinner <dchinner@xxxxxxxxxx>
    Reviewed-by: Mark Tinguely <tinguely@xxxxxxx>
    Reviewed-by: Christoph Hellwig <hch@xxxxxx>
    Signed-off-by: Ben Myers <bpm@xxxxxxx>

commit 7e18530bef6a18a5479690ae7e8256319ecf1300
Author: Dave Chinner <dchinner@xxxxxxxxxx>
Date:   Mon Oct 8 21:56:00 2012 +1100

    xfs: rationalise xfs_mount_wq users

    Instead of starting and stopping background work on the xfs_mount_wq
    all at the same time, separate them to where they really are needed
    to start and stop.

    The xfs_sync_worker only needs to be started after all the mount
    processing has completed successfully, while it needs to be stopped
    before the log is unmounted.

    The xfs_reclaim_worker is started on demand, and can be
    stopped before the unmount process does its own inode reclaim pass.

    The xfs_flush_inodes work is run on demand, and so we really only
    need to ensure that it has stopped running before we start
    processing an unmount, freeze or remount,ro.

    Signed-off-by: Dave Chinner <dchinner@xxxxxxxxxx>
    Reviewed-by: Mark Tinguely <tinguely@xxxxxxx>
    Reviewed-by: Christoph Hellwig <hch@xxxxxx>
    Signed-off-by: Ben Myers <bpm@xxxxxxx>

commit 33c7a2bc48a81fa714572f8ce29f29bc17e6faf0
Author: Dave Chinner <dchinner@xxxxxxxxxx>
Date:   Mon Oct 8 21:55:59 2012 +1100

    xfs: xfs_syncd_stop must die

    xfs_syncd_start and xfs_syncd_stop tie a bunch of unrelated
    functionality together that actually have different start and stop
    requirements. Kill these functions and open code the start/stop
    methods for each of the background functions.

    Subsequent patches will move the start/stop functions around to the
    correct places to avoid races and shutdown issues.

    Signed-off-by: Dave Chinner <dchinner@xxxxxxxxxx>
    Reviewed-by: Christoph Hellwig <hch@xxxxxx>
    Reviewed-by: Mark Tinguely <tinguely@xxxxxxx>
    Signed-off-by: Ben Myers <bpm@xxxxxxx>

-----------------------------------------------------------------------

Summary of changes:
 fs/xfs/Makefile                     |    3 +-
 fs/xfs/xfs_export.c                 |    1 +
 fs/xfs/xfs_file.c                   |   13 +-
 fs/xfs/{xfs_sync.c => xfs_icache.c} |  701 ++++++++++++++++++++--------------
 fs/xfs/{xfs_sync.h => xfs_icache.h} |   14 +-
 fs/xfs/xfs_iget.c                   |  705 -----------------------------------
 fs/xfs/xfs_inode.c                  |  251 +++++++++++++
 fs/xfs/xfs_inode.h                  |   10 +-
 fs/xfs/xfs_iomap.c                  |   23 +-
 fs/xfs/xfs_itable.c                 |    1 +
 fs/xfs/xfs_log.c                    |  122 +++++-
 fs/xfs/xfs_log.h                    |    4 +
 fs/xfs/xfs_log_priv.h               |    1 +
 fs/xfs/xfs_log_recover.c            |    1 +
 fs/xfs/xfs_mount.c                  |   31 +-
 fs/xfs/xfs_mount.h                  |    6 +-
 fs/xfs/xfs_qm.c                     |    1 +
 fs/xfs/xfs_qm_syscalls.c            |    1 +
 fs/xfs/xfs_rtalloc.c                |    1 +
 fs/xfs/xfs_super.c                  |  139 ++++---
 fs/xfs/xfs_super.h                  |    1 +
 fs/xfs/xfs_vnodeops.c               |    3 +-
 22 files changed, 919 insertions(+), 1114 deletions(-)
 rename fs/xfs/{xfs_sync.c => xfs_icache.c} (64%)
 rename fs/xfs/{xfs_sync.h => xfs_icache.h} (73%)
 delete mode 100644 fs/xfs/xfs_iget.c

hooks/post-receive
-- 
XFS development tree