Re: [PATCH 3/3] pnfsblock: bail out unaligned DIO

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 05/28/2012 05:30 AM, tao.peng@xxxxxxx wrote:

>> -----Original Message-----
>> From: Myklebust, Trond [mailto:Trond.Myklebust@xxxxxxxxxx]

<>

>> Also, why do you consider it to be direct i/o specific? If the
>> application is using byte range locking, and the locks aren't page/block
>> aligned then you are in the same position of having to deal with partial
>> page writes even in the read/write from page cache situation.


> You are right about byte range locking + buffered IO, and it should
> be fixed in pg_test with bellow patch and it could be a stable
> candidate. 


What?? please explain. It sounds like you are saying that there is a
very *serious* bug in current block-layout.

>From my experiment I know that lots and lots of IO is done none-paged
aligned even in buffered IO. Actually NFS goes to great length not to do
the usual read-modify-write per page, but keeps the byte range that
was written per page and only RPCs the exact offset-length of the modification.
Because by definition NFS is byte aligned IO, not "blocks" or "sectors".

Please explain what happens now. Is it a data corruption? Or just
performance slowness.

I don't understand. Don't you do the proper read-copy-modify-write that's
mandated by block layout RFC? byte aligned? And what are sectors and PAGES
got to do with it? I thought all IO must be "block" aligned.

In objects-layout we have even worse alignment constrains with raid5
(stripe_size alignment). It was needed to do a (very simple BTW)
read-modify-write. Involving not just partial pages but also full pages read.
BTW we read into the page-cache the surrounding pages, so not to read them multiple
times.

> It is different from DIO case because for DIO we have to
> be sure each page is blocksize aligned. And it can't easily be done
> in pg_test because in pg_test we only have one nfs_page to test
> against.
> 

<snip>

> diff --git a/fs/nfs/blocklayout/blocklayout.c b/fs/nfs/blocklayout/blocklayout.c
> index 7ae8a60..a84a0da 100644
> --- a/fs/nfs/blocklayout/blocklayout.c
> +++ b/fs/nfs/blocklayout/blocklayout.c
> @@ -925,6 +925,18 @@ nfs4_blk_get_deviceinfo(struct nfs_server *server, const struct nfs_fh *fh,
>  	return rv;
>  }
>  
> +static bool
> +bl_pg_test(struct nfs_pageio_descriptor *pgio, struct nfs_page *prev,
> +	   struct nfs_page *req)
> +{
> +	/* Bail out page unligned IO */
> +	if (req->wb_offset || req->wb_pgbase ||
> +	    req->wb_bytes != PAGE_CACHE_SIZE)
> +		return false;
> +


This is very serious. Not many applications will currently pass
this test. (And hence will not do direct IO)

What happens today without this patch?

> +	return pnfs_generic_pg_test(pgio, prev, req);
> +}
> +


<>

Thanks
Boaz
--
To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[Index of Archives]     [Linux Filesystem Development]     [Linux USB Development]     [Linux Media Development]     [Video for Linux]     [Linux NILFS]     [Linux Audio Users]     [Yosemite Info]     [Linux SCSI]

  Powered by Linux