Re: [PATCH 2/2] xfs: don't truncate prealloc from frequently accessed inodes

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Mon, 2010-10-04 at 21:13 +1100, Dave Chinner wrote:
> From: Dave Chinner <dchinner@xxxxxxxxxx>
> 
> A long standing problem for streaming writeÑ through the NFS server
> has been that the NFS server opens and closes file descriptors on an
> inode for every write. The result of this behaviour is that the
> ->release() function is called on every close and that results in
> XFS truncating speculative preallocation beyond the EOF.  This has
> an adverse effect on file layout when multiple files are being
> written at the same time - they interleave their extents and can
> result in severe fragmentation.
> 
> To avoid this problem, keep a count of the number of ->release calls
> made on an inode. For most cases, an inode is only going to be opened
> once for writing and then closed again during it's lifetime in
> cache. Hence if there are multiple ->release calls, there is a good
> chance that the inode is being accessed by the NFS server. Hence
> count up every time ->release is called while there are delalloc
> blocks still outstanding on the inode.
> 
> If this count is non-zero when ->release is next called, then do no
> truncate away the speculative preallocation - leave it there so that
> subsequent writes do not need to reallocate the delalloc space. This
> will prevent interleaving of extents of different inodes written
> concurrently to the same AG.
> 
> If we get this wrong, it is not a big deal as we truncate
> speculative allocation beyond EOF anyway in xfs_inactive() when the
> inode is thrown out of the cache.
> 
> The new counter in the struct xfs_inode fits into a hole in the
> structure on 64 bit machines, so does not grow the size of the inode
> at all.

This seems reasonable, and I have no real objection to
it.  However, I have a question and a comment related
to the affected code (and not your specific change).

> Signed-off-by: Dave Chinner <dchinner@xxxxxxxxxx>
> ---
>  fs/xfs/xfs_inode.h    |    1 +
>  fs/xfs/xfs_vnodeops.c |   15 ++++++++++++++-
>  2 files changed, 15 insertions(+), 1 deletions(-)
> 
> diff --git a/fs/xfs/xfs_inode.h b/fs/xfs/xfs_inode.h
> index 1594190..82aad5e 100644
> --- a/fs/xfs/xfs_inode.h
> +++ b/fs/xfs/xfs_inode.h
> @@ -261,6 +261,7 @@ typedef struct xfs_inode {
>  	xfs_fsize_t		i_size;		/* in-memory size */
>  	xfs_fsize_t		i_new_size;	/* size when write completes */
>  	atomic_t		i_iocount;	/* outstanding I/O count */
> +	int			i_dirty_releases; /* dirty ->release calls */
>  
>  	/* VFS inode */
>  	struct inode		i_vnode;	/* embedded VFS inode */
> diff --git a/fs/xfs/xfs_vnodeops.c b/fs/xfs/xfs_vnodeops.c
> index b7bdc43..0c8eeba 100644
> --- a/fs/xfs/xfs_vnodeops.c
> +++ b/fs/xfs/xfs_vnodeops.c

OK, this comment is unrelated to your exact change.  But just above
the next hunk there's a big nasty condition, which appears to
be *almost* duplicated in xfs_inactive() (twice!).  It would be
very nice if, while you're at modifying this nearby code, you
could encapsulate that condition in a macro that has a meaningful
name.

> @@ -979,14 +979,27 @@ xfs_release(
>  			 * chance to drop them once the last reference to
>  			 * the inode is dropped, so we'll never leak blocks
>  			 * permanently.

I'm curious what the effect is if we simply don't do the truncate
*except* when the inode becomes inactive.  It means we hang onto
the stuff for a while longer, and maybe it makes things messier
in the event of a crash.  Can you tell me why we do the truncate
here as well as in xfs_inactive() (or what the problem is of
*not* doing it here)?

> +			 *
> +			 * Further, count the number of times we get here in
> +			 * the life of this inode. If the inode is being
> +			 * opened, written and closed frequently and we have
> +			 * delayed allocation blocks oustanding (e.g. streaming
> +			 * writes from the NFS server), truncating the
> +			 * blocks past EOF will cause fragmentation to occur.
> +			 * In this case don't do the truncation, either.
>  			 */
> +			if (ip->i_delayed_blks)
> +				ip->i_dirty_releases++;
> +			if (ip->i_dirty_releases > 1)
> +					goto out;
> +
>  			error = xfs_free_eofblocks(mp, ip,
>  						   XFS_FREE_EOF_TRYLOCK);
>  			if (error)
>  				return error;
>  		}
>  	}
> -
> +out:
>  	return 0;
>  }
>  



_______________________________________________
xfs mailing list
xfs@xxxxxxxxxxx
http://oss.sgi.com/mailman/listinfo/xfs



[Index of Archives]     [Linux XFS Devel]     [Linux Filesystem Development]     [Filesystem Testing]     [Linux USB Devel]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]

  Powered by Linux