[XFS updates] XFS development tree branch, master, updated. v2.6.37-rc4-67-g73efe4a

xfs@xxxxxxxxxxx · Wed, 12 Jan 2011 09:37:25 -0600

This is an automated email from the git hooks/post-receive script. It was
generated because a ref change was pushed to the repository containing
the project "XFS development tree".

The branch, master has been updated
  73efe4a xfs: prevent NMI timeouts in cmn_err
  65a84a0 xfs: Add log level to assertion printk
  1884bd8 xfs: fix an assignment within an ASSERT()
  bfc6017 xfs: fix error handling for synchronous writes
  a46db60 xfs: add FITRIM support
  c58efdb xfs: ensure log covering transactions are synchronous
  eda7798 xfs: serialise unaligned direct IOs
  4d8d158 xfs: factor common write setup code
  637bbc7 xfs: split buffered IO write path from xfs_file_aio_write
  f0d26e8 xfs: split direct IO write path from xfs_file_aio_write
  487f84f xfs: introduce xfs_rw_lock() helpers for locking the inode
  4c5cfd1 xfs: factor post-write newsize updates
  edafb6d xfs: factor common post-write isize handling code
  a363f0c xfs: ensure sync write errors are returned
      from  d0eb2f38b250b7d6c993adf81b0e4ded0565497e (commit)

Those revisions listed above that are new to this repository have
not appeared on any other notification email; so we list those
revisions in full, below.

- Log -----------------------------------------------------------------
commit 73efe4a4ddf8eb2b1cc7039e8a66a23a424961af
Author: Dave Chinner <dchinner@xxxxxxxxxx>
Date:   Wed Jan 12 00:35:42 2011 +0000

    xfs: prevent NMI timeouts in cmn_err

    We currently have a global error message buffer in cmn_err that is
    protected by a spin lock that disables interrupts.  Recently there
    have been reports of NMI timeouts occurring when the console is
    being flooded by SCSI error reports due to cmn_err() getting stuck
    trying to print to the console while holding this lock (i.e. with
    interrupts disabled). The NMI watchdog is seeing this CPU as
    non-responding and so is triggering a panic.  While the trigger for
    the reported case is SCSI errors, pretty much anything that spams
    the kernel log could cause this to occur.

    Realistically the only reason that we have the intermediate message
    buffer is to prepend the correct kernel log level prefix to the log
    message. The only reason we have the lock is to protect the global
    message buffer and the only reason the message buffer is global is
    to keep it off the stack. Hence if we can avoid needing a global
    message buffer we avoid needing the lock, and we can do this with a
    small amount of cleanup and some preprocessor tricks:

    	1. clean up xfs_cmn_err() panic mask functionality to avoid
    	   needing debug code in xfs_cmn_err()
    	2. remove the couple of "!" message prefixes that still exist that
    	   the existing cmn_err() code steps over.
    	3. redefine CE_* levels directly to KERN_*
    	4. redefine cmn_err() and friends to use printk() directly
    	   via variable argument length macros.

    By doing this, we can completely remove the cmn_err() code and the
    lock that is causing the problems, and rely solely on printk()
    serialisation to ensure that we don't get garbled messages.

    A series of followup patches is really needed to clean up all the
    cmn_err() calls and related messages properly, but that results in a
    series that is not easily back portable to enterprise kernels. Hence
    this initial fix is only to address the direct problem in the lowest
    impact way possible.

    Signed-off-by: Dave Chinner <dchinner@xxxxxxxxxx>
    Reviewed-by: Christoph Hellwig <hch@xxxxxx>
    Signed-off-by: Alex Elder <aelder@xxxxxxx>
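
Editor's sketch (not the code from this commit): steps 3 and 4 above can be
illustrated in userspace roughly as follows. The names sketch_printk and the
local KERN_* definitions are assumptions made for the example; the real patch
uses the kernel's printk() and its own macro layout. Because the level is a
string literal, the preprocessor pastes it onto the format string, so no
intermediate buffer (and therefore no lock) is needed.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>

/* Local stand-ins for the kernel log-level prefixes (assumed for the sketch). */
#define KERN_ALERT   "<1>"
#define KERN_WARNING "<4>"

/* Step 3: the CE_* levels are defined directly as the KERN_* strings. */
#define CE_ALERT KERN_ALERT
#define CE_WARN  KERN_WARNING

/*
 * Hypothetical userspace stand-in for printk(): formats to stdout and
 * returns the number of characters written.
 */
static int sketch_printk(const char *fmt, ...)
{
	va_list ap;
	int n;

	assert(fmt != NULL);
	va_start(ap, fmt);
	n = vprintf(fmt, ap);
	va_end(ap);
	return n;
}

/*
 * Step 4: cmn_err() becomes a variadic macro. Adjacent string literals are
 * concatenated at compile time, so the level prefix is prepended without a
 * global message buffer. The first variadic argument must be a literal
 * format string.
 */
#define cmn_err(level, ...) sketch_printk(level __VA_ARGS__)
```

A call such as cmn_err(CE_WARN, "bad block %d\n", 7) expands to
sketch_printk("<4>" "bad block %d\n", 7), leaving serialisation entirely to
the underlying print function.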

commit 65a84a0f7567ea244e5246e642920260cfc2744a
Author: Anton Blanchard <anton@xxxxxxxxx>
Date:   Fri Jan 7 03:30:41 2011 +0000

    xfs: Add log level to assertion printk

    I received a ppc64 bug report involving xfs but the assertion was
    filtered out by the console log level. Use KERN_CRIT to ensure it
    makes it out.

    Signed-off-by: Anton Blanchard <anton@xxxxxxxxx>
    Reviewed-by: Christoph Hellwig <hch@xxxxxx>
    Signed-off-by: Alex Elder <aelder@xxxxxxx>

commit 1884bd8354c9aec4ca501dc4773c13ad2a09af7b
Author: Jesper Juhl <jj@xxxxxxxxxxxxx>
Date:   Sat Dec 25 20:14:53 2010 +0000

    xfs: fix an assignment within an ASSERT()

    In fs/xfs/xfs_trans.c::xfs_trans_unreserve_and_mod_sb() at the out:
    label we have this:
    	ASSERT(error = 0);
    I believe a comparison was intended, not an assignment. If I'm
    right, the patch below fixes that up.

    Signed-off-by: Jesper Juhl <jj@xxxxxxxxxxxxx>
    Signed-off-by: Alex Elder <aelder@xxxxxxx>
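
Editor's sketch: the difference between the two forms can be seen in a small
standalone example. The helper names below are hypothetical and the ASSERT
macro is a plain assert() stand-in, not the XFS definition. The assignment
form always evaluates to 0 and also overwrites the error value, so in a debug
build the check could never pass.

```c
#include <assert.h>

/* Stand-in for the XFS debug macro; in this sketch it is plain assert(). */
#define ASSERT(expr) assert(expr)

/*
 * Value the buggy expression hands to ASSERT(): the assignment stores 0
 * into *error and yields 0, regardless of what *error held before.
 */
static int buggy_condition(int *error)
{
	return (*error = 0);
}

/* Value of the intended comparison: 1 only when no error is pending. */
static int fixed_condition(const int *error)
{
	return (*error == 0);
}

/* The corrected check as it might sit at an "out:" label. */
static int finish(int error)
{
	ASSERT(error == 0);
	return error;
}
```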

commit bfc60177f8ab509bc225becbb58f7e53a0e33e81
Author: Christoph Hellwig <hch@xxxxxxxxxxxxx>
Date:   Fri Jan 7 13:02:23 2011 +0000

    xfs: fix error handling for synchronous writes

    If we get an IO error on a synchronous superblock write, we attach an
    error release function to it so that when the last reference goes away
    the release function is called and the buffer is invalidated and
    unlocked. The buffer is left locked until the release function is
    called so that other concurrent users of the buffer will be locked out
    until the buffer error is fully processed.

    Unfortunately, for the superblock buffer the filesystem itself holds a
    reference to the buffer which prevents the reference count from
    dropping to zero and the release function being called. As a result,
    once an IO error occurs on a sync write, the buffer will never be
    unlocked and all future attempts to lock the buffer will hang.

    To make matters worse, this problem is not unique to such buffers;
    if there is a concurrent _xfs_buf_find() running, the lookup will grab
    a reference to the buffer and then wait on the buffer lock, preventing
    the reference count from ever falling to zero and hence unlocking the
    buffer.

    As such, the whole b_relse function implementation is broken because it
    cannot rely on the buffer reference count falling to zero to unlock the
    errored buffer. The synchronous write error path is the only path that
    uses this callback - it is used to ensure that the synchronous waiter
    gets the buffer error before the error state is cleared from the buffer
    by the release function.

    Given that the only synchronous buffer writes now go through xfs_bwrite
    and the error path in question can only occur for a write of a dirty,
    logged buffer, we can move most of the b_relse processing to happen
    inline in xfs_buf_iodone_callbacks, just like a normal I/O completion.
    In addition to that we make sure the error is not cleared in
    xfs_buf_iodone_callbacks, so that xfs_bwrite can reliably check it.
    Given that xfs_bwrite keeps the buffer locked until it has waited for
    it and checked the error, this allows us to reliably propagate the error
    to the caller and ensures that the buffer is reliably unlocked.

    Given that xfs_buf_iodone_callbacks was the only instance of the
    b_relse callback we can remove it entirely.

    Based on earlier patches by Dave Chinner and Ajeet Yadav.

    Signed-off-by: Christoph Hellwig <hch@xxxxxx>
    Reported-by: Ajeet Yadav <ajeet.yadav.77@xxxxxxxxx>
    Reviewed-by: Dave Chinner <dchinner@xxxxxxxxxx>
    Signed-off-by: Alex Elder <aelder@xxxxxxx>
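
Editor's sketch: a toy model of the problem described above, assuming a much
simplified buffer with only a reference count, a lock flag and an error
field. None of these names are the real xfs_buf API. It shows why unlocking
from a release callback that fires only at refcount zero cannot work when
someone else (here, the filesystem's own superblock reference) still holds
the buffer, and how unlocking from the waiter avoids the dependency.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, heavily simplified buffer; not the real struct xfs_buf. */
struct buf {
	int refcount;
	int locked;
	int error;
	void (*relse)(struct buf *);
};

/* Old-style error release: clears the error and unlocks the buffer. */
static void error_relse(struct buf *bp)
{
	bp->error = 0;
	bp->locked = 0;
}

/* Drop a reference; the release callback runs only on the last one. */
static void buf_rele(struct buf *bp)
{
	assert(bp->refcount > 0);
	if (--bp->refcount == 0 && bp->relse != NULL)
		bp->relse(bp);
}

/* Old scheme: on error, attach the callback and drop the writer's ref. */
static void old_sync_write_error(struct buf *bp, int error)
{
	bp->error = error;
	bp->relse = error_relse;
	buf_rele(bp);		/* another holder keeps refcount above 0 */
}

/*
 * New scheme: I/O completion leaves the error in place, the synchronous
 * waiter reads it while still holding the lock, then unlocks directly.
 */
static int new_sync_write_error(struct buf *bp, int error)
{
	int seen;

	bp->error = error;	/* completion does not clear the error */
	seen = bp->error;	/* waiter checks it under the lock */
	bp->locked = 0;		/* waiter unlocks, independent of refcount */
	buf_rele(bp);
	return seen;
}
```

With a starting refcount of 2 (filesystem plus writer), the old path leaves
the buffer locked forever, while the new path returns the error and unlocks.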

commit a46db60834883c1c8c665d7fcc7b4ab66f5966fc
Author: Christoph Hellwig <hch@xxxxxxxxxxxxx>
Date:   Fri Jan 7 13:02:04 2011 +0000

    xfs: add FITRIM support

    Allow manual discards from userspace using the FITRIM ioctl.  This is not
    intended to be run during normal workloads, as the free space btree walks
    can cause large performance degradation.

    Signed-off-by: Christoph Hellwig <hch@xxxxxx>
    Reviewed-by: Dave Chinner <dchinner@xxxxxxxxxx>
    Signed-off-by: Alex Elder <aelder@xxxxxxx>
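
Editor's sketch: from userspace, a manual discard might be requested as
below. FITRIM and struct fstrim_range come from <linux/fs.h>; the helper name
trim_filesystem and its return convention are assumptions for illustration.
The call normally needs administrative privileges and a device that supports
discard, so expect it to fail elsewhere.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <limits.h>
#include <linux/fs.h>
#include <stdint.h>
#include <string.h>
#include <sys/ioctl.h>
#include <unistd.h>

/*
 * Ask the filesystem mounted at 'mountpoint' to discard every free extent
 * of at least 'minlen' bytes. On success returns 0 and stores the byte
 * count reported back by the kernel in *trimmed; on failure returns -1
 * with errno set.
 */
static int trim_filesystem(const char *mountpoint, uint64_t minlen,
			   uint64_t *trimmed)
{
	struct fstrim_range range;
	int fd, ret, saved_errno;

	assert(mountpoint != NULL && trimmed != NULL);

	fd = open(mountpoint, O_RDONLY);
	if (fd < 0)
		return -1;

	memset(&range, 0, sizeof(range));
	range.start = 0;		/* begin at the start of the filesystem */
	range.len = ULLONG_MAX;		/* cover the whole filesystem */
	range.minlen = minlen;		/* skip free extents smaller than this */

	ret = ioctl(fd, FITRIM, &range);
	saved_errno = errno;
	if (ret == 0)
		*trimmed = range.len;	/* kernel writes back bytes trimmed */
	close(fd);
	errno = saved_errno;		/* close() must not hide the ioctl error */
	return ret;
}
```

As the commit message notes, this is meant for occasional manual or scheduled
use rather than for running during normal workloads.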

commit c58efdb442bb49dea1d148f207560c41918c1bf4
Author: Dave Chinner <dchinner@xxxxxxxxxx>
Date:   Tue Jan 4 04:49:29 2011 +0000

    xfs: ensure log covering transactions are synchronous

    To ensure the log is covered and the filesystem idles correctly, we
    need to ensure that dummy transactions hit the disk and do not stay
    pinned in memory.  If the superblock is pinned in memory, it can't
    be flushed so the log covering cannot make progress. The result is
    dependent on timing - more often than not we continue to issue a
    log covering transaction every 36s rather than idling after ~90s.

    Fix this by making the log covering transaction synchronous. To
    avoid additional log force from xfssyncd, make the log covering
    transaction take the place of the existing log force in the xfssyncd
    background sync process.

    Signed-off-by: Dave Chinner <dchinner@xxxxxxxxxx>
    Reviewed-by: Christoph Hellwig <hch@xxxxxx>
    Signed-off-by: Alex Elder <aelder@xxxxxxx>

commit eda77982729b7170bdc9e8855f0682edf322d277
Author: Dave Chinner <dchinner@xxxxxxxxxx>
Date:   Tue Jan 11 10:22:40 2011 +1100

    xfs: serialise unaligned direct IOs

    When two concurrent unaligned, non-overlapping direct IOs are issued
    to the same block, the direct IO layer will race to zero the block.
    The result is that one of the concurrent IOs will overwrite data
    written by the other IO with zeros. This is demonstrated by the
    xfsqa test 240.

    To avoid this problem, serialise all unaligned direct IOs to an
    inode with a big hammer. We need a big hammer approach as we need to
    serialise AIO as well, so we can't just block writes on locks.
    Hence, the big hammer is calling xfs_ioend_wait() while holding out
    other unaligned direct IOs from starting.

    We don't bother trying to serialise aligned vs unaligned IOs as
    they are overlapping IO and the result of concurrent overlapping IOs
    is undefined - the result of either IO is a valid result so we let
    them race. Hence we only penalise unaligned IO, which already has a
    major overhead compared to aligned IO so this isn't a major problem.

    Signed-off-by: Dave Chinner <dchinner@xxxxxxxxxx>
    Reviewed-by: Alex Elder <aelder@xxxxxxx>
    Reviewed-by: Christoph Hellwig <hch@xxxxxx>
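
Editor's sketch: one plausible way to classify a direct IO as unaligned is to
test whether its start or end offset falls inside a filesystem block. The
function name and signature below are hypothetical and the block size is
assumed to be a power of two; the actual patch may perform the check
differently.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Return 1 if the byte range [pos, pos + count) does not start and end on
 * a block boundary, 0 if it is fully block aligned. 'blocksize' must be a
 * non-zero power of two.
 */
static int dio_is_unaligned(uint64_t pos, uint64_t count, uint32_t blocksize)
{
	uint64_t mask;

	assert(blocksize != 0 && (blocksize & (blocksize - 1)) == 0);
	mask = (uint64_t)blocksize - 1;

	/* A non-zero remainder at either end means sub-block zeroing. */
	return ((pos & mask) != 0) || (((pos + count) & mask) != 0);
}
```

Only IOs for which such a check is true would pay the serialisation cost;
fully aligned IOs never need sub-block zeroing and can proceed concurrently.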

commit 4d8d15812fd9bc96d0da11467d23e0373feae933
Author: Dave Chinner <dchinner@xxxxxxxxxx>
Date:   Tue Jan 11 10:23:42 2011 +1100

    xfs: factor common write setup code

    The buffered IO and direct IO write paths share a common set of
    checks and limiting code prior to issuing the write. Factor that
    into a common helper function.

    Signed-off-by: Dave Chinner <dchinner@xxxxxxxxxx>
    Reviewed-by: Alex Elder <aelder@xxxxxxx>
    Reviewed-by: Christoph Hellwig <hch@xxxxxx>

commit 637bbc75d9fda57c7bc77ce5ee37e29a77a0520d
Author: Dave Chinner <dchinner@xxxxxxxxxx>
Date:   Tue Jan 11 10:17:30 2011 +1100

    xfs: split buffered IO write path from xfs_file_aio_write

    Complete the split of the different write IO paths by splitting the
    buffered IO write path out of xfs_file_aio_write(). This makes the
    different mechanisms of the write paths easier to follow.

    Signed-off-by: Dave Chinner <dchinner@xxxxxxxxxx>
    Reviewed-by: Alex Elder <aelder@xxxxxxx>
    Reviewed-by: Christoph Hellwig <hch@xxxxxx>

commit f0d26e860b6c496464c5c8165d7df08dabde01fa
Author: Dave Chinner <dchinner@xxxxxxxxxx>
Date:   Tue Jan 11 10:15:36 2011 +1100

    xfs: split direct IO write path from xfs_file_aio_write

    The current xfs_file_aio_write code is a mess of locking shenanigans
    to handle the different locking requirements of buffered and direct
    IO. Start to clean this up by disentangling the direct IO path from
    the mess.

    This also removes the failed direct IO fallback path to buffered IO.
    XFS handles all direct IO cases without needing to fall back to
    buffered IO, so we can safely remove this unused path. This greatly
    simplifies the logic and locking needed in the write path.

    Signed-off-by: Dave Chinner <dchinner@xxxxxxxxxx>
    Reviewed-by: Christoph Hellwig <hch@xxxxxx>

commit 487f84f3f80bc6f00c59725e822653d3ec174b85
Author: Dave Chinner <dchinner@xxxxxxxxxx>
Date:   Wed Jan 12 11:37:10 2011 +1100

    xfs: introduce xfs_rw_lock() helpers for locking the inode

    We need to obtain the i_mutex, i_iolock and i_ilock during the read
    and write paths. Add a set of wrapper functions to neatly
    encapsulate the lock ordering and shared/exclusive semantics to make
    the locking easier to follow and get right.

    Note that this changes some of the exclusive locking serialisation in
    that serialisation will occur against the i_mutex instead of the
    XFS_IOLOCK_EXCL. This does not change any behaviour, and it is
    arguably more efficient to use the mutex for such serialisation than
    the rw_sem.

    Signed-off-by: Dave Chinner <dchinner@xxxxxxxxxx>
    Reviewed-by: Christoph Hellwig <hch@xxxxxx>

commit 4c5cfd1b4157fb75d43b44a147c2feba6422fc4f
Author: Dave Chinner <dchinner@xxxxxxxxxx>
Date:   Tue Jan 11 10:14:16 2011 +1100

    xfs: factor post-write newsize updates

    Signed-off-by: Dave Chinner <dchinner@xxxxxxxxxx>
    Reviewed-by: Alex Elder <aelder@xxxxxxx>
    Reviewed-by: Christoph Hellwig <hch@xxxxxx>

commit edafb6da9aa725e4de5fe758fe81644b6167f9a2
Author: Dave Chinner <dchinner@xxxxxxxxxx>
Date:   Tue Jan 11 10:14:06 2011 +1100

    xfs: factor common post-write isize handling code

    Signed-off-by: Dave Chinner <dchinner@xxxxxxxxxx>
    Reviewed-by: Alex Elder <aelder@xxxxxxx>
    Reviewed-by: Christoph Hellwig <hch@xxxxxx>

commit a363f0c2030cb9781e7e458f4a9e354b6c43d7ce
Author: Dave Chinner <dchinner@xxxxxxxxxx>
Date:   Tue Jan 11 10:13:53 2011 +1100

    xfs: ensure sync write errors are returned

    xfs_file_aio_write() only returns the error from synchronous
    flushing of the data and inode if error == 0. At the point where
    error is being checked, it is guaranteed to be > 0. Therefore any
    errors returned by the data or fsync flush will never be returned.
    Fix the checks so we overwrite the current error once and only if an
    error really occurred.

    Signed-off-by: Dave Chinner <dchinner@xxxxxxxxxx>
    Reviewed-by: Alex Elder <aelder@xxxxxxx>
    Reviewed-by: Christoph Hellwig <hch@xxxxxx>
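
Editor's sketch: the logic error can be modelled with two small functions.
The names are hypothetical, and the choice that a data flush error takes
precedence over an fsync error is an assumption of this sketch, not something
stated in the commit message. Here 'ret' is the positive byte count of a
successful write, and the two error arguments are 0 or a negative errno.

```c
#include <assert.h>

/*
 * Buggy pattern: the flush errors are only copied when ret == 0, but ret
 * holds a positive byte count at this point, so they are always dropped.
 */
static long finish_sync_write_buggy(long ret, int flush_err, int fsync_err)
{
	if (ret == 0)
		ret = flush_err;
	if (ret == 0)
		ret = fsync_err;
	return ret;
}

/*
 * Fixed pattern: overwrite the result once, and only if one of the sync
 * steps really failed.
 */
static long finish_sync_write_fixed(long ret, int flush_err, int fsync_err)
{
	assert(ret > 0);
	if (flush_err)
		ret = flush_err;
	else if (fsync_err)
		ret = fsync_err;
	return ret;
}
```

With a 4096 byte write and a flush error of -5, the buggy version still
reports 4096 bytes written while the fixed version reports -5.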

-----------------------------------------------------------------------

Summary of changes:
 fs/xfs/Makefile                |    1 +
 fs/xfs/linux-2.6/xfs_buf.c     |    7 +-
 fs/xfs/linux-2.6/xfs_buf.h     |    7 +-
 fs/xfs/linux-2.6/xfs_discard.c |  191 ++++++++++++++
 fs/xfs/linux-2.6/xfs_discard.h |    8 +
 fs/xfs/linux-2.6/xfs_file.c    |  535 +++++++++++++++++++++++-----------------
 fs/xfs/linux-2.6/xfs_ioctl.c   |    3 +
 fs/xfs/linux-2.6/xfs_super.c   |    2 +-
 fs/xfs/linux-2.6/xfs_sync.c    |   11 +-
 fs/xfs/linux-2.6/xfs_sysctl.c  |   23 ++-
 fs/xfs/linux-2.6/xfs_trace.h   |   33 +++
 fs/xfs/support/debug.c         |  112 ++++-----
 fs/xfs/support/debug.h         |   25 ++-
 fs/xfs/xfs_alloc.c             |   10 +-
 fs/xfs/xfs_alloc.h             |   25 ++-
 fs/xfs/xfs_buf_item.c          |  151 ++++--------
 fs/xfs/xfs_error.c             |   31 ---
 fs/xfs/xfs_error.h             |   18 +-
 fs/xfs/xfs_fsops.c             |   10 +-
 fs/xfs/xfs_fsops.h             |    2 +-
 fs/xfs/xfs_log.c               |    2 +-
 fs/xfs/xfs_log_recover.c       |    2 +-
 fs/xfs/xfs_trans.c             |    2 +-
 23 files changed, 729 insertions(+), 482 deletions(-)
 create mode 100644 fs/xfs/linux-2.6/xfs_discard.c
 create mode 100644 fs/xfs/linux-2.6/xfs_discard.h

hooks/post-receive
-- 
XFS development tree
