Re: [PATCH 4/8] xfs: log file size updates at I/O completion time

Dave Chinner <david@xxxxxxxxxxxxx> · Tue, 21 Feb 2012 16:05:26 +1100

On Mon, Feb 20, 2012 at 07:38:28PM -0500, Christoph Hellwig wrote:
> Do not use unlogged metadata updates and the VFS dirty bit for updating
> the file size after writeback.  In addition to causing various problems
> with updates getting delayed for far too long this also drags in the
> unscalable VFS dirty tracking, and is one of the few remaining unlogged
> metadata updates.
> 
> Signed-off-by: Christoph Hellwig <hch@xxxxxx>

....
>  
> +STATIC int
> +xfs_setfilesize_trans_alloc(
> +	struct xfs_ioend	*ioend)
> +{

....
> +	/*
> +	 * We hand off the transaction to the completion thread now, so
> +	 * clear the flag here.
> +	 */
> +	current_restore_flags_nested(&tp->t_pflags, PF_FSTRANS);

......

> @@ -173,18 +214,32 @@ xfs_end_io(
>  	 * range to normal written extens after the data I/O has finished.
>  	 */
>  	if (ioend->io_type == IO_UNWRITTEN) {
> +		/*
> +		 * For buffered I/O we never preallocate a transaction when
> +		 * doing the unwritten extent conversion, but for direct I/O
> +		 * we do not know if we are converting and unwritten extent
> +		 * or not at the point where we preallocate the transaction.
> +		 */
> +		if (ioend->io_append_trans) {
> +			ASSERT(ioend->io_isdirect);
> +
> +			current_set_flags_nested(
> +				&ioend->io_append_trans->t_pflags, PF_FSTRANS);
> +			xfs_trans_cancel(ioend->io_append_trans, 0);

.....

> @@ -1280,17 +1340,33 @@ xfs_vm_direct_IO(
>  {
>  	struct inode		*inode = iocb->ki_filp->f_mapping->host;
>  	struct block_device	*bdev = xfs_find_bdev_for_inode(inode);
> +	struct xfs_ioend	*ioend = NULL;
>  	ssize_t			ret;
>  
>  	if (rw & WRITE) {
> -		iocb->private = xfs_alloc_ioend(inode, IO_DIRECT);
> +		size_t size = iov_length(iov, nr_segs);
> +
> +		/*
> +		 * Direct I/O code may have to convert unwritten extents from
> +		 * the AIO and I/O handler in interrupt context.  To make this
> +		 * possible we have to preallocate an ioend that allows defering
> +		 * it here.  For the case where we did not find an unwritten
> +		 * extent we'll free it again later.
> +		 */
> +		iocb->private = ioend = xfs_alloc_ioend(inode, IO_DIRECT);
> +		if (offset + size > XFS_I(inode)->i_d.di_size) {
> +			ret = xfs_setfilesize_trans_alloc(ioend);
> +			if (ret)
> +				goto destroy_ioend;
> +			ioend->io_isdirect = 1;
> +		}
>  
>  		ret = __blockdev_direct_IO(rw, iocb, inode, bdev, iov,
>  					    offset, nr_segs,
>  					    xfs_get_blocks_direct,
>  					    xfs_end_io_direct_write, NULL, 0);
>  		if (ret != -EIOCBQUEUED && iocb->private)
> -			xfs_destroy_ioend(iocb->private);
> +			goto destroy_ioend;
>  	} else {
>  		ret = __blockdev_direct_IO(rw, iocb, inode, bdev, iov,
>  					    offset, nr_segs,
> @@ -1299,6 +1375,12 @@ xfs_vm_direct_IO(
>  	}
>  
>  	return ret;
> +
> +destroy_ioend:
> +	if (ioend->io_append_trans)
> +		xfs_trans_cancel(ioend->io_append_trans, 0);

Hmmm, I missed something here first time through: do we need to
restore the PF_FSTRANS flag here before cancelling the transaction?
I don't think we do (clearing a flag that is aready cleared is not a
problem), but we should be consistent with the IO completion
handling of this transaction.

Other than that, everything looks good, so you can add:

Reviewed-by: Dave Chinner <dchinner@xxxxxxxxxx>

When that inconsistency is fixed.

Cheers,

Dave.
-- 
Dave Chinner
david@xxxxxxxxxxxxx

_______________________________________________
xfs mailing list
xfs@xxxxxxxxxxx
http://oss.sgi.com/mailman/listinfo/xfs