Re: [PATCH 3/3] pnfsblock: bail out unaligned DIO

"Myklebust, Trond" <Trond.Myklebust@xxxxxxxxxx> · Sun, 27 May 2012 16:38:37 +0000

On Sun, 2012-05-27 at 13:33 +0800, Peng Tao wrote:
> Signed-off-by: Peng Tao <tao.peng@xxxxxxx>
> ---
>  fs/nfs/blocklayout/blocklayout.c |   20 ++++++++++++++++++++
>  1 files changed, 20 insertions(+), 0 deletions(-)
> 
> diff --git a/fs/nfs/blocklayout/blocklayout.c b/fs/nfs/blocklayout/blocklayout.c
> index 53cb450..cdb87a9 100644
> --- a/fs/nfs/blocklayout/blocklayout.c
> +++ b/fs/nfs/blocklayout/blocklayout.c
> @@ -1000,7 +1000,27 @@ static bool bl_dio_begin(struct inode *inode, const struct iovec *iov,
>  			 unsigned long nr_segs, loff_t pos,
>  			 struct blk_plug *plug)
>  {
> +	unsigned blkmask = NFS_SERVER(inode)->pnfs_blksize - 1;
> +	size_t count;
> +	int seg;
> +	unsigned long addr;
> +
>  	blk_start_plug(plug);
> +
> +	/* Only allow blksized DIO for now.
> +	 * In theory we can handle page aligned DIO in current block layout
> +	 * read/write code, but it would require serialization between
> +	 * concurrent writers and it is far less efficient than just sending IO
> +	 * to MDS.
> +	 */
> +	if (pos & blkmask)
> +		return false;
> +	for (seg = 0; seg < nr_segs; seg++) {
> +		addr = (unsigned long)iov[seg].iov_base;
> +		count = iov[seg].iov_len;
> +		if (unlikely((addr & blkmask) || (count & blkmask)))
> +			return false;
> +	}
>  	return true;
>  }

Again, this check can and should go in the existing nfs_pageio_ops,
either in pg_init or in pg_test.

Also, why do you consider it to be direct I/O specific? If the
application is using byte-range locking and the locks aren't page/block
aligned, then you are in the same position of having to deal with partial
page writes even when reading/writing through the page cache.
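
To illustrate the point, here is a minimal userspace sketch of an
alignment test that does not care whether a request came from direct I/O
or from the page cache. The names range_is_blk_aligned() and
blksize_is_valid() are hypothetical and do not exist in the kernel; the
real pg_init/pg_test callbacks take kernel structures that are not
modelled here. The sketch covers only the file offset and length, not the
user buffer address that the patch also checks, and it assumes the block
size is a power of two, as the mask arithmetic in the patch does.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper (not a kernel function): the mask trick below is
 * only valid when the block size is a nonzero power of two.
 */
bool blksize_is_valid(uint32_t blksize)
{
	return blksize != 0 && (blksize & (blksize - 1)) == 0;
}

/*
 * Hypothetical helper (not a kernel function): returns true when both
 * the starting offset and the length of a request are multiples of
 * blksize, i.e. the request never touches a partial block.
 */
bool range_is_blk_aligned(uint64_t offset, uint64_t len, uint32_t blksize)
{
	uint64_t blkmask;

	if (!blksize_is_valid(blksize))
		return false;

	blkmask = (uint64_t)blksize - 1;

	/* OR-ing first lets one mask test cover both values. */
	return ((offset | len) & blkmask) == 0;
}
```

With a 4096-byte block size, a request at offset 8192 of length 8192
passes, while one at offset 512, or one of length 100, fails. A write
bounded by an unaligned byte-range lock would fail in the same way even
though it is not direct I/O.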

-- 
Trond Myklebust
Linux NFS client maintainer

NetApp
Trond.Myklebust@xxxxxxxxxx
www.netapp.com
