This is an automated email from the git hooks/post-receive script. It was
generated because a ref change was pushed to the repository containing
the project "XFS development tree".

The branch, for-next has been updated
       2fe8c1c xfs: open code inc_inode_iversion when logging an inode
       8f80587 xfs: increase inode cluster size for v5 filesystems
       9e3908e xfs: fix unlock in xfs_bmap_add_attrfork
  from ec715cacd53f63b21da3a9bc96a9bb4c527e25b1 (commit)

Those revisions listed above that are new to this repository have not
appeared on any other notification email; so we list those revisions
in full, below.

- Log -----------------------------------------------------------------
commit 2fe8c1c08b3fbd87dd2641c8f032ff6e965d5803
Author: Dave Chinner <dchinner@xxxxxxxxxx>
Date:   Fri Nov 1 15:27:17 2013 +1100

xfs: open code inc_inode_iversion when logging an inode

Michael L Semon reported that generic/069 runtime increased on v5
superblocks by 100% compared to v4 superblocks. His perf-based
analysis pointed directly at the timestamp updates being done by the
write path in this workload. The append writers are doing 4-byte
writes, so there are lots of timestamp updates occurring.

The thing is, they aren't being triggered by timestamp changes - they
are being triggered by the inode change counter needing to be updated.
That is, every write(2) system call needs to bump the inode version
count, and it does that through the timestamp update mechanism. Hence
test generic/069 is running 3 orders of magnitude more timestamp
update transactions on v5 filesystems due to the fact it does a huge
number of *4 byte* write(2) calls.

This isn't a real world scenario we really need to address - anyone
doing such sequential IO should be using fwrite(3), not write(2).
i.e. fwrite(3) buffers the writes in userspace to minimise the number
of write(2) syscalls, and the problem goes away.

However, there is a small change we can make to improve the situation -
removing the expensive lock operation on the change counter update.
All inode version counter changes in XFS occur under the ip->i_ilock
during a transaction, and therefore we don't actually need the spin
lock that provides exclusive access to it through inc_inode_iversion().
Hence avoid the lock and just open code the increment ourselves when
logging the inode.

Reported-by: Michael L. Semon <mlsemon35@xxxxxxxxx>
Signed-off-by: Dave Chinner <dchinner@xxxxxxxxxx>
Reviewed-by: Christoph Hellwig <hch@xxxxxx>
Signed-off-by: Ben Myers <bpm@xxxxxxx>

commit 8f80587bacb6eb893df0ee4e35fefa0dfcfdf9f4
Author: Dave Chinner <dchinner@xxxxxxxxxx>
Date:   Fri Nov 1 15:27:20 2013 +1100

xfs: increase inode cluster size for v5 filesystems

v5 filesystems use 512 byte inodes as a minimum, so read inodes in
clusters that are effectively half the size of a v4 filesystem with
256 byte inodes. For v5 filesystems, scale the inode cluster size
with the size of the inode so that we keep a constant 32 inodes per
cluster ratio for all inode IO.

This only works if mkfs.xfs sets the inode alignment appropriately
for larger inode clusters, so this functionality is made conditional
on mkfs doing the right thing. xfs_repair needs to know about the
inode alignment changes, too.
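As a rough illustration of the scaling arithmetic above (a standalone
sketch, not the XFS code; inode_cluster_bytes and INODES_PER_CLUSTER
are made-up names): keeping 32 inodes per cluster means 256 byte
inodes keep the traditional 8k cluster buffer while 512 byte inodes
get a 16k cluster.

/*
 * Illustrative sketch only -- not the XFS implementation.  It assumes
 * the traditional 8192 byte cluster buffer used for 256 byte inodes
 * and holds the 32 inodes-per-cluster ratio constant as the on-disk
 * inode size grows.
 */
#include <stdio.h>

#define INODES_PER_CLUSTER	32

static unsigned int inode_cluster_bytes(unsigned int inode_size)
{
	/* 256 * 32 = 8192 (v4 default), 512 * 32 = 16384 (v5 minimum) */
	return inode_size * INODES_PER_CLUSTER;
}

int main(void)
{
	printf("256 byte inodes -> %u byte clusters\n", inode_cluster_bytes(256));
	printf("512 byte inodes -> %u byte clusters\n", inode_cluster_bytes(512));
	return 0;
}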
Wall time:
            create   bulkstat   find+stat   ls -R   unlink
v4          237s     161s       173s        201s    299s
v5          235s     163s       205s         31s    356s
patched     234s     160s       182s         29s    317s

System time:
            create   bulkstat   find+stat   ls -R   unlink
v4          2601s    2490s      1653s       1656s   2960s
v5          2637s    2497s      1681s         20s   3216s
patched     2613s    2451s      1658s         20s   3007s

So, wall time same or down across the board, system time same or down
across the board, and cache hit rates all improve except for the ls -R
case, which is a pure cold cache directory read workload on v5
filesystems...

So, this patch removes most of the performance and CPU usage
differential between v4 and v5 filesystems on traversal related
workloads.

Note: while this patch is currently for v5 filesystems only, there is
no reason it can't be ported back to v4 filesystems. This hasn't been
done here because bringing the code back to v4 requires forwards and
backwards kernel compatibility testing, i.e. to determine if older
kernels(*) do the right thing with larger inode alignments but still
only using 8k inode cluster sizes. None of this testing and validation
on v4 filesystems has been done, so for the moment larger inode
clusters are limited to v5 superblocks.

(*) a current default config v4 filesystem should mount just fine on
2.6.23 (when lazy-count support was introduced), and so if we change
the alignment emitted by mkfs without a feature bit then we have to
make sure it works properly on all kernels since 2.6.23. And if we
allow it to be changed when the lazy-count bit is not set, then it's
all kernels since v2 logs were introduced that need to be tested for
compatibility...

Signed-off-by: Dave Chinner <dchinner@xxxxxxxxxx>
Reviewed-by: Christoph Hellwig <hch@xxxxxx>
Reviewed-by: Eric Sandeen <sandeen@xxxxxxxxxx>
Signed-off-by: Ben Myers <bpm@xxxxxxx>

commit 9e3908e342eba6684621e616529669c17e271e7e
Author: Mark Tinguely <tinguely@xxxxxxx>
Date:   Thu Nov 7 15:43:28 2013 -0600

xfs: fix unlock in xfs_bmap_add_attrfork

xfs_trans_ijoin() activates the inode in a transaction and can also
specify which lock to free when the transaction is committed or
canceled. The xfs_bmap_add_attrfork() call locks the inode and adds
the lock to the transaction, but also manually removes the lock.
Change the routine to not add the lock to the transaction and to
manually remove the lock on completion.

While here, clean up the xfs_trans_cancel flags and goto names.

Signed-off-by: Mark Tinguely <tinguely@xxxxxxx>
Reviewed-by: Christoph Hellwig <hch@xxxxxx>
Signed-off-by: Ben Myers <bpm@xxxxxxx>
-----------------------------------------------------------------------

Summary of changes:
 fs/xfs/xfs_bmap.c        | 38 +++++++++++++++++++++-----------------
 fs/xfs/xfs_mount.c       | 15 +++++++++++++++
 fs/xfs/xfs_mount.h       |  2 +-
 fs/xfs/xfs_trans_inode.c |  8 +++++---
 fs/xfs/xfs_trans_resv.c  |  3 +--
 5 files changed, 43 insertions(+), 23 deletions(-)

hooks/post-receive
--
XFS development tree

_______________________________________________
xfs mailing list
xfs@xxxxxxxxxxx
http://oss.sgi.com/mailman/listinfo/xfs