Re: [RFC][PATCH] Vector read/write support for NFS (DIO) client

On Fri, 2011-04-15 at 13:33 -0400, Christoph Hellwig wrote:
> On Tue, Apr 12, 2011 at 11:49:29AM -0400, Trond Myklebust wrote:
> > Your approach goes in the direction of further special-casing O_DIRECT
> > in the NFS client. I'd like to move away from that and towards
> > integration with the ordinary read/write codepaths so that aside from
> > adding request coalescing, we can also enable pNFS support.
> 
> What is the exact plan?  Split the direct I/O into two passes, one
> to lock down the user pages and then a second one to send the pages
> over the wire, which is shared with the writeback code?  If that's
> the case it should naturally allow plugging in a scheme like Badari's
> to send pages from different iovecs in a single on-the-wire request -
> after all page cache pages are non-contiguous in virtual and physical
> memory, too.

You can't lock the user pages unfortunately: they may need to be faulted
in.

What I was thinking of doing is splitting out the code in the RPC
callbacks that plays around with page flags and puts the pages on the
inode's dirty list, so that it doesn't get called in the case of
O_DIRECT.
I then want to attach the O_DIRECT pages to the nfsi->nfs_page_tree
radix tree so that they can be tracked by the NFS layer. I'm assuming
that nobody is going to be silly enough to require simultaneous writes
via O_DIRECT to the same locations.
Then we can feed the O_DIRECT pages into nfs_page_async_flush() so that
they share the existing page cache write coalescing and pNFS code.
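
To illustrate why tracking the pages by file index makes coalescing fall
out naturally, here is a rough user-space sketch. It is not kernel code
and not the actual implementation: struct toy_req and toy_coalesce() are
hypothetical names, and the requests are assumed to arrive sorted by page
index, roughly as a walk of an index-keyed tree would return them. The
point is only that the merge decision looks at file offsets, not at which
iovec a page came from.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a per-page request; NOT struct nfs_page. */
struct toy_req {
    unsigned long index; /* file page index covered by this request */
    int iov;             /* originating user iovec (informational only) */
};

/*
 * Group requests into runs of consecutive page indices, with at most
 * max_pages pages per run. Each run models one on-the-wire request.
 *
 * reqs must be sorted by index. run_len must have room for n entries;
 * on return run_len[i] holds the number of pages in run i.
 * Returns the number of runs.
 */
size_t toy_coalesce(const struct toy_req *reqs, size_t n,
                    size_t max_pages, size_t *run_len)
{
    size_t runs = 0;

    assert(max_pages >= 1);

    for (size_t i = 0; i < n; i++) {
        /* Contiguous if this page directly follows the previous one. */
        int contiguous = i > 0 &&
                         reqs[i].index == reqs[i - 1].index + 1;

        if (contiguous && run_len[runs - 1] < max_pages)
            run_len[runs - 1]++;   /* extend the current run */
        else
            run_len[runs++] = 1;   /* start a new run */
    }
    return runs;
}
```

In this sketch, pages at indices 10, 11, 12 that come from two different
iovecs still end up in one run, while a gap in the file offsets or the
max_pages limit starts a new one.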

The commit code will be easy to reuse too, since the requests are listed
in the radix tree and so nfs_scan_list() can find and process them in
the usual fashion.

The main problem that I have yet to figure out is what to do if the
server flags a reboot and the requests need to be resent. One option I'm
looking into is using the aio 'kick handler' to resubmit the writes.
Another may be to just resend directly from the nfsiod work queue.

> When do you plan to release your read/write code rewrite?  If it's
> not anytime soon how is applying Badari's patch going to hurt?  Most
> of it probably will get reverted with a complete rewrite, but at least
> the logic to check which direct I/O iovecs can be coalesced would stay
> in the new world order.

I'm hoping that I can do the rewrite fairly quickly once the resend
problem is solved. It shouldn't be more than a couple of weeks of
coding.

-- 
Trond Myklebust
Linux NFS client maintainer

NetApp
Trond.Myklebust@xxxxxxxxxx
www.netapp.com
