This is an automated email from the git hooks/post-receive script. It was
generated because a ref change was pushed to the repository containing
the project "XFS development tree".

The branch, master has been updated
      7346e11 xfs simplify and speed up direct I/O completions
      c203b1d xfs: move aio completion after unwritten extent conversion
      9ecd1fb direct-io: move aio_complete into ->end_io
      d2ec2c7 xfs: clean up xfs_bmap_get_bp
      0af00e4 xfs: simplify xfs_truncate_file
      ce15629 xfs: kill the b_strat callback in xfs_buf
      1b95090 xfs: remove obsolete osyncisosync mount option
      9e69683 xfs: clean up filestreams helpers
      0fd7275 xfs: fix gcc 4.6 set but not read and unused statement warnings
      9625169 xfs: Fix build when CONFIG_XFS_POSIX_ACL=n
      4d423a9 xfs: fix unsigned underflow in xfs_free_eofblocks
      6f6b39e xfs: use GFP_NOFS for page cache allocation
      6af128f xfs: fix memory reclaim recursion deadlock on locked inode buffer
      4b3ac9d xfs: fix xfs_trans_add_item() lockdep warnings
      a424144 xfs: simplify and remove xfs_ireclaim
      57bf7d8 xfs: don't block on buffer read errors
      8687ad5 Merge branch 'master' into for-2.6.36
      16fd536 xfs: track AGs with reclaimable inodes in per-ag radix tree
      70e60ce xfs: convert inode shrinker to per-filesystem contexts
      7f8275d mm: add context argument to shrinker callback
      from  0fef16d86bfc1f74fb55e0f755adc3b5d8a3e84f (commit)

Those revisions listed above that are new to this repository have
not appeared on any other notification email; so we list those
revisions in full, below.

- Log -----------------------------------------------------------------
commit 7346e1197eb76e22199b6b4625f129331e0fd7ac
Author: Christoph Hellwig <hch@xxxxxxxxxxxxx>
Date:   Sun Jul 18 21:17:11 2010 +0000

    xfs simplify and speed up direct I/O completions

    Our current handling of direct I/O completions is rather suboptimal,
    because we defer it to a workqueue more often than needed, and we
    perform a much too aggressive flush of the workqueue in case unwritten
    extent conversions happen.
    This patch changes the direct I/O reads to not even use a completion
    handler, as we don't bother to use it at all, and to perform the
    unwritten extent conversions in caller context for synchronous direct
    I/O.

    For a small I/O size direct I/O workload on a consumer grade SSD, such
    as the untar of a kernel tree inside qemu, this patch gives speedups
    of about 5%, getting us much closer to the speed of a native block
    device, or a fully allocated XFS file.

    Signed-off-by: Christoph Hellwig <hch@xxxxxx>
    Reviewed-by: Dave Chinner <dchinner@xxxxxxxxxx>
    Signed-off-by: Alex Elder <aelder@xxxxxxx>

commit c203b1d4c7f27368df5d6211bf9a621acf32904b
Author: Christoph Hellwig <hch@xxxxxxxxxxxxx>
Date:   Sun Jul 18 21:17:10 2010 +0000

    xfs: move aio completion after unwritten extent conversion

    If we write into an unwritten extent using AIO we need to complete the
    AIO request after the extent conversion has finished. Without that a
    read could race to see the extent still unwritten and return zeros.
    For synchronous I/O we already take care of that by flushing the
    xfsconvertd workqueue (which might be a bit of overkill).

    To do that add iocb and result fields to struct xfs_ioend, so that we
    can call aio_complete from xfs_end_io after the extent conversion has
    happened. Note that we need a new result field as io_error is used for
    positive errno values, while the AIO code can return negative error
    values and positive transfer sizes.

    Signed-off-by: Christoph Hellwig <hch@xxxxxx>
    Reviewed-by: Dave Chinner <dchinner@xxxxxxxxxx>
    Signed-off-by: Alex Elder <aelder@xxxxxxx>

commit 9ecd1fbd13f2518fb7076516300bdaa4d644e6c0
Author: Christoph Hellwig <hch@xxxxxxxxxxxxx>
Date:   Sun Jul 18 21:17:09 2010 +0000

    direct-io: move aio_complete into ->end_io

    Filesystems with unwritten extent support must not complete an AIO
    request until the transaction to convert the extent has been
    committed.
    That means the aio_complete call needs to be moved into the ->end_io
    callback so that the filesystem can control when to call it exactly.
    This makes a bit of a mess out of dio_complete and makes the ->end_io
    callback prototype even more complicated.

    Signed-off-by: Christoph Hellwig <hch@xxxxxx>
    Reviewed-by: Jan Kara <jack@xxxxxxx>
    Signed-off-by: Alex Elder <aelder@xxxxxxx>

commit d2ec2c790b3994bf0c4381cacc489b514705887c
Author: Christoph Hellwig <hch@xxxxxxxxxxxxx>
Date:   Thu Jul 22 12:52:08 2010 +1000

    xfs: clean up xfs_bmap_get_bp

    Signed-off-by: Christoph Hellwig <hch@xxxxxx>
    Reviewed-by: Dave Chinner <dchinner@xxxxxxxxxx>
    Signed-off-by: Dave Chinner <david@xxxxxxxxxxxxx>

commit 0af00e497915cfdbef86298a7541d294c629c4aa
Author: Christoph Hellwig <hch@xxxxxxxxxxxxx>
Date:   Tue Jul 20 17:51:31 2010 +1000

    xfs: simplify xfs_truncate_file

    xfs_truncate_file is only used for truncating quota files. Move it to
    xfs_qm_syscalls.c so it can be marked static and take advantage of the
    fact by removing the unused page cache validation and taking the iget
    into the helper.

    Signed-off-by: Christoph Hellwig <hch@xxxxxx>
    Reviewed-by: Dave Chinner <dchinner@xxxxxxxxxx>
    Signed-off-by: Dave Chinner <david@xxxxxxxxxxxxx>

commit ce15629a36583e132397fff67f716b80e618b51c
Author: Christoph Hellwig <hch@xxxxxxxxxxxxx>
Date:   Tue Jul 20 17:51:16 2010 +1000

    xfs: kill the b_strat callback in xfs_buf

    The b_strat callback is used by xfs_buf_iostrategy to perform
    additional checks before submitting a buffer. It is used in xfs_bwrite
    and when writing out delayed buffers. In xfs_bwrite we can
    de-virtualize the call easily as b_strat is set a few lines above the
    call to xfs_buf_iostrategy. For the delayed buffers the rationale is a
    bit more complicated:

     - there are three callers of xfs_buf_delwri_queue, which places
       buffers on the delwri list:
        (1) xfs_bdwrite - this sets up b_strat, so it's fine
        (2) xfs_buf_iorequest.
            None of the callers can have XBF_DELWRI set:
              - xlog_bdstrat is only used for log buffers, which are
                never delwri
              - _xfs_buf_read explicitly clears the delwri flag
              - xfs_buf_iodone_work retries log buffers only
              - xfsbdstrat - only used for reads, superblock writes
                without the delwri flag, log I/O and file zeroing with
                explicitly allocated buffers.
              - xfs_buf_iostrategy - only calls xfs_buf_iorequest if
                b_strat is not set
        (3) xfs_buf_unlock - only puts the buffer on the delwri list if
            the DELWRI flag is already set. The DELWRI flag is only ever
            set in xfs_bwrite, xfs_buf_iodone_callbacks, or
            xfs_trans_log_buf. For xfs_buf_iodone_callbacks and
            xfs_trans_log_buf we require an initialized buf item, which
            means b_strat was set to xfs_bdstrat_cb in xfs_buf_item_init.

    Conclusion: we can just get rid of the callback and replace it with
    explicit calls to xfs_bdstrat_cb.

    Signed-off-by: Christoph Hellwig <hch@xxxxxx>
    Reviewed-by: Dave Chinner <dchinner@xxxxxxxxxx>
    Signed-off-by: Dave Chinner <david@xxxxxxxxxxxxx>

commit 1b95090f3d3a554a8c018539b8dde1425bdb0a4e
Author: Christoph Hellwig <hch@xxxxxxxxxxxxx>
Date:   Tue Jul 20 17:50:52 2010 +1000

    xfs: remove obsolete osyncisosync mount option

    Since Linux 2.6.33 the kernel has support for real O_SYNC, which made
    the osyncisosync option a no-op. Warn the users about this and remove
    the mount flag for it.

    Signed-off-by: Christoph Hellwig <hch@xxxxxx>
    Reviewed-by: Dave Chinner <dchinner@xxxxxxxxxx>
    Signed-off-by: Dave Chinner <david@xxxxxxxxxxxxx>

commit 9e69683f392eb5b643f632122c4a6384bd5a9a82
Author: Christoph Hellwig <hch@xxxxxxxxxxxxx>
Date:   Tue Jul 20 17:31:01 2010 +1000

    xfs: clean up filestreams helpers

    Move xfs_filestream_peek_ag, xfs_filestream_get_ag and
    xfs_filestream_put_ag from xfs_filestream.h to xfs_filestream.c where
    their only callers are, and remove the inline marker while we're at it
    to let the compiler decide on the inlining. Also don't return a value
    from xfs_filestream_put_ag because we don't need it.

    Signed-off-by: Christoph Hellwig <hch@xxxxxx>
    Reviewed-by: Dave Chinner <dchinner@xxxxxxxxxx>
    Signed-off-by: Dave Chinner <david@xxxxxxxxxxxxx>

commit 0fd7275cc42ab734eaa1a2c747e65479bd1e42af
Author: Christoph Hellwig <hch@xxxxxxxxxxxxx>
Date:   Tue Jul 20 17:54:45 2010 +1000

    xfs: fix gcc 4.6 set but not read and unused statement warnings

    [hch: dropped a few hunks that need structural changes instead]

    Signed-off-by: Andi Kleen <ak@xxxxxxxxxxxxxxx>
    Reviewed-by: Dave Chinner <dchinner@xxxxxxxxxx>
    Signed-off-by: Dave Chinner <david@xxxxxxxxxxxxx>

commit 96251696da0ed81b1e201f8a743ce65ea295c499
Author: Tony Luck <tony.luck@xxxxxxxxx>
Date:   Tue Jul 20 17:54:41 2010 +1000

    xfs: Fix build when CONFIG_XFS_POSIX_ACL=n

    When CONFIG_XFS_POSIX_ACL is not set "xfs_check_acl" is #defined to
    NULL - which breaks the code attempting to add a tracepoint on this
    function. Only define the tracepoint when the function exists.

    Signed-off-by: Tony Luck <tony.luck@xxxxxxxxx>
    Reviewed-by: Christoph Hellwig <hch@xxxxxx>
    Signed-off-by: Dave Chinner <david@xxxxxxxxxxxxx>

commit 4d423a9210b0f279d843982044322b9542b8c2dc
Author: Kulikov Vasiliy <segooon@xxxxxxxxx>
Date:   Tue Jul 20 17:54:28 2010 +1000

    xfs: fix unsigned underflow in xfs_free_eofblocks

    map_len is unsigned, so checking map_len <= 0 does not catch the case
    where the value should have gone below zero. Check the exact
    expression instead of map_len.

    Signed-off-by: Kulikov Vasiliy <segooon@xxxxxxxxx>
    Reviewed-by: Christoph Hellwig <hch@xxxxxx>
    Signed-off-by: Dave Chinner <david@xxxxxxxxxxxxx>

commit 6f6b39eb706f5617750cf02952e4e6d7470c40bf
Author: Dave Chinner <dchinner@xxxxxxxxxx>
Date:   Tue Jul 20 17:54:12 2010 +1000

    xfs: use GFP_NOFS for page cache allocation

    Avoid a lockdep warning by preventing page cache allocation from
    recursing back into the filesystem during memory reclaim.

    Signed-off-by: Dave Chinner <dchinner@xxxxxxxxxx>
    Reviewed-by: Christoph Hellwig <hch@xxxxxx>
    Reviewed-by: Alex Elder <aelder@xxxxxxx>
    Signed-off-by: Dave Chinner <david@xxxxxxxxxxxxx>

commit 6af128f6ede206995a6be91beb3358b3afbaa8a6
Author: Dave Chinner <dchinner@xxxxxxxxxx>
Date:   Tue Jul 20 17:53:59 2010 +1000

    xfs: fix memory reclaim recursion deadlock on locked inode buffer

    Calling into memory reclaim with a locked inode buffer can deadlock
    if memory reclaim tries to lock the inode buffer during inode
    teardown. Convert the relevant memory allocations to use KM_NOFS to
    avoid this deadlock condition.

    Reported-by: Peter Watkins <treestem@xxxxxxxxx>
    Signed-off-by: Dave Chinner <dchinner@xxxxxxxxxx>
    Reviewed-by: Christoph Hellwig <hch@xxxxxx>
    Reviewed-by: Alex Elder <aelder@xxxxxxx>
    Signed-off-by: Dave Chinner <david@xxxxxxxxxxxxx>

commit 4b3ac9d54f2206e03fbfb30e113b05ff262a83e9
Author: Dave Chinner <dchinner@xxxxxxxxxx>
Date:   Tue Jul 20 17:53:44 2010 +1000

    xfs: fix xfs_trans_add_item() lockdep warnings

    xfs_trans_add_item() is called with ip->i_ilock held, which means it
    is unsafe for memory reclaim to recurse back into the filesystem
    (ilock is required in writeback). Hence the allocation needs to be
    KM_NOFS to avoid recursion.

    Lockdep report indicating memory allocation being called with the
    ip->i_ilock held is as follows:

    [ 1749.866796] =================================
    [ 1749.867788] [ INFO: inconsistent lock state ]
    [ 1749.868327] 2.6.35-rc3-dgc+ #25
    [ 1749.868741] ---------------------------------
    [ 1749.868741] inconsistent {IN-RECLAIM_FS-W} -> {RECLAIM_FS-ON-W} usage.
    [ 1749.868741] dd/2835 [HC0[0]:SC0[0]:HE1:SE1] takes:
    [ 1749.868741] (&(&ip->i_lock)->mr_lock){++++?.}, at: [<ffffffff813170fb>] xfs_ilock+0x10b/0x190
    [ 1749.868741] {IN-RECLAIM_FS-W} state was registered at:
    [ 1749.868741] [<ffffffff810b3a97>] __lock_acquire+0x437/0x1450
    [ 1749.868741] [<ffffffff810b4b56>] lock_acquire+0xa6/0x160
    [ 1749.868741] [<ffffffff810a20b5>] down_write_nested+0x65/0xb0
    [ 1749.868741] [<ffffffff813170fb>] xfs_ilock+0x10b/0x190
    [ 1749.868741] [<ffffffff8134e819>] xfs_reclaim_inode+0x99/0x310
    [ 1749.868741] [<ffffffff8134f56b>] xfs_inode_ag_walk+0x8b/0x150
    [ 1749.868741] [<ffffffff8134f6bb>] xfs_inode_ag_iterator+0x8b/0xf0
    [ 1749.868741] [<ffffffff8134f7a8>] xfs_reclaim_inode_shrink+0x88/0x90
    [ 1749.868741] [<ffffffff81119d07>] shrink_slab+0x137/0x1a0
    [ 1749.868741] [<ffffffff8111bbe1>] balance_pgdat+0x421/0x6a0
    [ 1749.868741] [<ffffffff8111bf7d>] kswapd+0x11d/0x320
    [ 1749.868741] [<ffffffff8109ce56>] kthread+0x96/0xa0
    [ 1749.868741] [<ffffffff81035de4>] kernel_thread_helper+0x4/0x10
    [ 1749.868741] irq event stamp: 4234335
    [ 1749.868741] hardirqs last enabled at (4234335): [<ffffffff81147d25>] kmem_cache_free+0x115/0x220
    [ 1749.868741] hardirqs last disabled at (4234334): [<ffffffff81147c4d>] kmem_cache_free+0x3d/0x220
    [ 1749.868741] softirqs last enabled at (4233112): [<ffffffff81084dd2>] __do_softirq+0x142/0x260
    [ 1749.868741] softirqs last disabled at (4233095): [<ffffffff81035edc>] call_softirq+0x1c/0x50
    [ 1749.868741]
    [ 1749.868741] other info that might help us debug this:
    [ 1749.868741] 2 locks held by dd/2835:
    [ 1749.868741] #0: (&(&ip->i_iolock)->mr_lock#2){+.+.+.}, at: [<ffffffff81316edd>] xfs_ilock_nowait+0xed/0x200
    [ 1749.868741] #1: (&(&ip->i_lock)->mr_lock){++++?.}, at: [<ffffffff813170fb>] xfs_ilock+0x10b/0x190
    [ 1749.868741]
    [ 1749.868741] stack backtrace:
    [ 1749.868741] Pid: 2835, comm: dd Not tainted 2.6.35-rc3-dgc+ #25
    [ 1749.868741] Call Trace:
    [ 1749.868741] [<ffffffff810b1faa>] print_usage_bug+0x18a/0x190
    [ 1749.868741]
    [<ffffffff8104264f>] ? save_stack_trace+0x2f/0x50
    [ 1749.868741] [<ffffffff810b2400>] ? check_usage_backwards+0x0/0xf0
    [ 1749.868741] [<ffffffff810b2f11>] mark_lock+0x331/0x400
    [ 1749.868741] [<ffffffff810b3047>] mark_held_locks+0x67/0x90
    [ 1749.868741] [<ffffffff810b3111>] lockdep_trace_alloc+0xa1/0xe0
    [ 1749.868741] [<ffffffff81147419>] kmem_cache_alloc+0x39/0x1e0
    [ 1749.868741] [<ffffffff8133f954>] kmem_zone_alloc+0x94/0xe0
    [ 1749.868741] [<ffffffff8133f9be>] kmem_zone_zalloc+0x1e/0x50
    [ 1749.868741] [<ffffffff81335f02>] xfs_trans_add_item+0x72/0xb0
    [ 1749.868741] [<ffffffff81339e41>] xfs_trans_ijoin+0xa1/0xd0
    [ 1749.868741] [<ffffffff81319f82>] xfs_itruncate_finish+0x312/0x5d0
    [ 1749.868741] [<ffffffff8133cb87>] xfs_free_eofblocks+0x227/0x280
    [ 1749.868741] [<ffffffff8133cd18>] xfs_release+0x138/0x190
    [ 1749.868741] [<ffffffff813464c5>] xfs_file_release+0x15/0x20
    [ 1749.868741] [<ffffffff81150ebf>] fput+0x13f/0x260
    [ 1749.868741] [<ffffffff8114d8c2>] filp_close+0x52/0x80
    [ 1749.868741] [<ffffffff8114d9a9>] sys_close+0xb9/0x120
    [ 1749.868741] [<ffffffff81034ff2>] system_call_fastpath+0x16/0x1b

    Signed-off-by: Dave Chinner <dchinner@xxxxxxxxxx>
    Reviewed-by: Christoph Hellwig <hch@xxxxxx>
    Reviewed-by: Alex Elder <aelder@xxxxxxx>
    Signed-off-by: Dave Chinner <david@xxxxxxxxxxxxx>

commit a42414455456af00c8040cce7cfc7df3344afbe5
Author: Dave Chinner <dchinner@xxxxxxxxxx>
Date:   Tue Jul 20 17:53:25 2010 +1000

    xfs: simplify and remove xfs_ireclaim

    xfs_ireclaim has to get and put the pag structure because it is only
    called with the inode to reclaim. The one caller of this function
    already has a reference on the pag and a pointer to it, so move the
    radix tree delete to the caller and remove xfs_ireclaim completely.
    This avoids a xfs_perag_get/put on every inode being reclaimed.
    The overhead was noticed in a bug report at:

            https://bugzilla.kernel.org/show_bug.cgi?id=16348

    Signed-off-by: Dave Chinner <dchinner@xxxxxxxxxx>
    Reviewed-by: Christoph Hellwig <hch@xxxxxx>
    Reviewed-by: Alex Elder <aelder@xxxxxxx>
    Signed-off-by: Dave Chinner <david@xxxxxxxxxxxxx>

commit 57bf7d895c6dcda45a2f0610870c81c36ff7bf34
Author: Dave Chinner <dchinner@xxxxxxxxxx>
Date:   Tue Jul 20 17:52:59 2010 +1000

    xfs: don't block on buffer read errors

    xfs_buf_read() fails to detect dispatch errors before attempting to
    wait on synchronous IO. If there was an error, it will get stuck
    forever, waiting for an I/O that was never started. Make sure the
    error is detected correctly.

    Further, such a failure can leave locked pages in the page cache
    which will cause a later operation to hang on the page. Ensure that
    we correctly process pages in the buffers when we get a dispatch
    error.

    Signed-off-by: Dave Chinner <dchinner@xxxxxxxxxx>
    Reviewed-by: Christoph Hellwig <hch@xxxxxx>
    Signed-off-by: Dave Chinner <david@xxxxxxxxxxxxx>

commit 8687ad50cbd11c52cb14e0d617b001b2bbc94b20
Merge: 1a0a6b973a9770bb6092110733aa541d3a331679 cd5b8f8755a89a57fc8c408d284b8b613f090345
Author: Dave Chinner <david@xxxxxxxxxxxxx>
Date:   Thu Jul 22 12:33:11 2010 +1000

    Merge branch 'master' into for-2.6.36

commit 16fd5367370099b59d96e30bb7d9de8d419659f2
Author: Dave Chinner <dchinner@xxxxxxxxxx>
Date:   Tue Jul 20 09:43:39 2010 +1000

    xfs: track AGs with reclaimable inodes in per-ag radix tree

    https://bugzilla.kernel.org/show_bug.cgi?id=16348

    When the filesystem grows to a large number of allocation groups, the
    summing of reclaimable inodes gets expensive. In many cases, most AGs
    won't have any reclaimable inodes and so we are wasting CPU time
    aggregating over these AGs. This is particularly important for the
    inode shrinker that gets called frequently under memory pressure.
    To avoid the overhead, track AGs with reclaimable inodes in the
    per-ag radix tree so that we can find all the AGs with reclaimable
    inodes via a simple gang tag lookup. This involves setting the tag
    when the first reclaimable inode is tracked in the AG, and removing
    the tag when the last reclaimable inode is removed from the tree.
    Then the summation process becomes a loop walking the radix tree
    summing AGs with the reclaim tag set.

    This significantly reduces the overhead of scanning - a 6400 AG
    filesystem now only uses about 25% of a cpu in kswapd while slab
    reclaim progresses instead of being permanently stuck at 100% CPU and
    making little progress. Clean filesystems will see no overhead and
    the overhead only increases linearly with the number of dirty AGs.

    Signed-off-by: Dave Chinner <dchinner@xxxxxxxxxx>
    Reviewed-by: Christoph Hellwig <hch@xxxxxx>

commit 70e60ce71516c3a9e882edb70a09f696a05961db
Author: Dave Chinner <dchinner@xxxxxxxxxx>
Date:   Tue Jul 20 08:07:02 2010 +1000

    xfs: convert inode shrinker to per-filesystem contexts

    Now the shrinker passes us a context, wire up a shrinker context per
    filesystem. This allows us to remove the global mount list and the
    locking problems that introduced. It also means that a shrinker call
    does not need to traverse clean filesystems before finding a
    filesystem with reclaimable inodes. This significantly reduces
    scanning overhead when lots of filesystems are present.

    Signed-off-by: Dave Chinner <dchinner@xxxxxxxxxx>
    Reviewed-by: Christoph Hellwig <hch@xxxxxx>

commit 7f8275d0d660c146de6ee3017e1e2e594c49e820
Author: Dave Chinner <dchinner@xxxxxxxxxx>
Date:   Mon Jul 19 14:56:17 2010 +1000

    mm: add context argument to shrinker callback

    The current shrinker implementation requires the registered callback
    to have global state to work from. This makes it difficult to shrink
    caches that are not global (e.g. per-filesystem caches).
    Pass the shrinker structure to the callback so that users can embed
    the shrinker structure in the context the shrinker needs to operate
    on and get back to it in the callback via container_of().

    Signed-off-by: Dave Chinner <dchinner@xxxxxxxxxx>
    Reviewed-by: Christoph Hellwig <hch@xxxxxx>

-----------------------------------------------------------------------

Summary of changes:
 fs/xfs/linux-2.6/xfs_aops.c    |  162 +++++++++++++++++++++------------------
 fs/xfs/linux-2.6/xfs_aops.h    |    2 +
 fs/xfs/linux-2.6/xfs_buf.c     |   34 +++++---
 fs/xfs/linux-2.6/xfs_buf.h     |    8 --
 fs/xfs/linux-2.6/xfs_super.c   |   12 +--
 fs/xfs/linux-2.6/xfs_sync.c    |  161 ++++++++++++++++++++++++++-------------
 fs/xfs/linux-2.6/xfs_sync.h    |    2 -
 fs/xfs/linux-2.6/xfs_trace.h   |    5 +
 fs/xfs/quota/xfs_qm.c          |    7 +-
 fs/xfs/quota/xfs_qm_syscalls.c |   70 +++++++++++++-----
 fs/xfs/xfs_alloc.c             |   10 +--
 fs/xfs/xfs_bmap.c              |   43 +++++------
 fs/xfs/xfs_buf_item.c          |    1 -
 fs/xfs/xfs_da_btree.c          |    6 +-
 fs/xfs/xfs_dir2_block.c        |    6 +-
 fs/xfs/xfs_filestream.c        |   80 +++++++++++++++++++-
 fs/xfs/xfs_filestream.h        |   82 --------------------
 fs/xfs/xfs_iget.c              |   56 +--------------
 fs/xfs/xfs_inode.c             |   25 +++----
 fs/xfs/xfs_inode.h             |    2 +-
 fs/xfs/xfs_inode_item.c        |   18 ++---
 fs/xfs/xfs_log.c               |    2 -
 fs/xfs/xfs_mount.h             |    4 +-
 fs/xfs/xfs_trans.c             |    2 +-
 fs/xfs/xfs_utils.c             |   83 --------------------
 fs/xfs/xfs_utils.h             |    1 -
 fs/xfs/xfs_vnodeops.c          |    4 +-
 27 files changed, 410 insertions(+), 478 deletions(-)


hooks/post-receive
-- 
XFS development tree

_______________________________________________
xfs mailing list
xfs@xxxxxxxxxxx
http://oss.sgi.com/mailman/listinfo/xfs
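
For readers unfamiliar with the pattern named in commit 7f8275d above
(embedding the shrinker in a per-filesystem context and recovering that
context with container_of()), the following is a minimal userspace
sketch, not the kernel code. The struct layout, the two-argument
callback signature, and the names demo_mount, demo_shrink and
demo_mount_init are all hypothetical and exist only for illustration;
container_of() is redefined locally because this compiles outside the
kernel.

```c
#include <assert.h>
#include <stddef.h>   /* offsetof */

/* Userspace stand-in for the kernel's container_of() macro. */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Hypothetical, simplified shrinker: the callback is handed the
 * shrinker structure itself instead of relying on global state. */
struct shrinker {
	int (*shrink)(struct shrinker *s, int nr_to_scan);
};

/* Hypothetical per-filesystem context that embeds its own shrinker. */
struct demo_mount {
	int reclaimable;          /* objects this filesystem could free */
	struct shrinker shrinker; /* embedded structure, not a pointer */
};

/* Callback: get back to the owning demo_mount from the shrinker. */
static int demo_shrink(struct shrinker *s, int nr_to_scan)
{
	struct demo_mount *mp = container_of(s, struct demo_mount, shrinker);

	assert(nr_to_scan >= 0);
	if (nr_to_scan > mp->reclaimable)
		nr_to_scan = mp->reclaimable;
	mp->reclaimable -= nr_to_scan;
	return mp->reclaimable;   /* objects still left in this cache */
}

/* Set up one context; each mount gets an independent shrinker. */
static void demo_mount_init(struct demo_mount *mp, int reclaimable)
{
	mp->reclaimable = reclaimable;
	mp->shrinker.shrink = demo_shrink;
}
```

Calling a.shrinker.shrink(&a.shrinker, 4) on a demo_mount initialised
with 10 reclaimable objects touches only that mount's counter, which is
the property the per-filesystem conversion in commit 70e60ce relies on:
no global list has to be walked to find the right cache.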