Re: Link performance over NFS degraded in RHEL5. -- was : Read/Write NFS I/O performance degraded by FLUSH_STABLE page flushing

On Fri, 2009-06-05 at 12:05 -0400, J. Bruce Fields wrote:
> On Fri, Jun 05, 2009 at 09:57:19AM -0400, Steve Dickson wrote:
> > 
> > 
> > Trond Myklebust wrote:
> > > On Fri, 2009-06-05 at 09:30 -0400, Steve Dickson wrote:
> > >> Tom Talpey wrote:
> > >>> On 6/5/2009 7:35 AM, Steve Dickson wrote:
> > >>>> Brian R Cowan wrote:
> > >>>>> Trond Myklebust<trond.myklebust@xxxxxxxxxx> wrote on 06/04/2009 02:04:58 PM:
> > >>>>>
> > >>>>>> Did you try turning off write gathering on the server (i.e. add the
> > >>>>>> 'no_wdelay' export option)? As I said earlier, that forces a delay of
> > >>>>>> 10ms per RPC call, which might explain the FILE_SYNC slowness.
> > >>>>> Just tried it, this seems to be a very useful workaround as well. The
> > >>>>> FILE_SYNC write calls come back in about the same amount of time as the
> > >>>>> write+commit pairs... Speeds up building regardless of the network
> > >>>>> filesystem (ClearCase MVFS or straight NFS).
> > >>>> Does anybody have the history as to why 'wdelay' is an
> > >>>> export default?
> > >>> Because "wdelay" is a complete crock?
> > >>>
> > >>> Adding 10ms to every write RPC only helps if there's a steady
> > >>> single-file stream arriving at the server. In most other workloads
> > >>> it only slows things down.
> > >>>
> > >>> The better solution is to continue tuning the clients to issue
> > >>> writes in a more sequential and less all-or-nothing fashion.
> > >>> There are plenty of other less crock-ful things to do in the
> > >>> server, too.
> > >> Ok... So do you think removing it as a default would cause
> > >> any regressions?
> > > 
> > > It might for NFSv2 clients, since they don't have the option of using
> > > unstable writes. I'd therefore prefer a kernel solution that makes write
> > > gathering an NFSv2 only feature.
> > Sounds good to me! ;-)
> 
> Patch welcomed.--b.

Something like this ought to suffice...

-----------------------------------------------------------------------
From: Trond Myklebust <Trond.Myklebust@xxxxxxxxxx>
NFSD: Make sure that write gathering only applies to NFSv2

NFSv3 and above can use unstable writes whenever they are sending more
than one write, rather than relying on the flaky write gathering
heuristics. More often than not, write gathering is currently getting it
wrong when the NFSv3 clients are sending a single write with FILE_SYNC
for efficiency reasons.

This patch turns off write gathering for NFSv3/v4, and ensures that
it only applies to the one case that can actually benefit: namely NFSv2.

Signed-off-by: Trond Myklebust <Trond.Myklebust@xxxxxxxxxx>
---

 fs/nfsd/vfs.c |    8 +++++---
 1 files changed, 5 insertions(+), 3 deletions(-)


diff --git a/fs/nfsd/vfs.c b/fs/nfsd/vfs.c
index b660435..f30cc4e 100644
--- a/fs/nfsd/vfs.c
+++ b/fs/nfsd/vfs.c
@@ -975,6 +975,7 @@ nfsd_vfs_write(struct svc_rqst *rqstp, struct svc_fh *fhp, struct file *file,
 	__be32			err = 0;
 	int			host_err;
 	int			stable = *stablep;
+	int			use_wgather;
 
 #ifdef MSNFS
 	err = nfserr_perm;
@@ -993,9 +994,10 @@ nfsd_vfs_write(struct svc_rqst *rqstp, struct svc_fh *fhp, struct file *file,
 	 *  -	the sync export option has been set, or
 	 *  -	the client requested O_SYNC behavior (NFSv3 feature).
 	 *  -   The file system doesn't support fsync().
-	 * When gathered writes have been configured for this volume,
+	 * When NFSv2 gathered writes have been configured for this volume,
 	 * flushing the data to disk is handled separately below.
 	 */
+	use_wgather = (rqstp->rq_vers == 2) && EX_WGATHER(exp);
 
 	if (!file->f_op->fsync) {/* COMMIT3 cannot work */
 	       stable = 2;
@@ -1004,7 +1006,7 @@ nfsd_vfs_write(struct svc_rqst *rqstp, struct svc_fh *fhp, struct file *file,
 
 	if (!EX_ISSYNC(exp))
 		stable = 0;
-	if (stable && !EX_WGATHER(exp)) {
+	if (stable && !use_wgather) {
 		spin_lock(&file->f_lock);
 		file->f_flags |= O_SYNC;
 		spin_unlock(&file->f_lock);
@@ -1040,7 +1042,7 @@ nfsd_vfs_write(struct svc_rqst *rqstp, struct svc_fh *fhp, struct file *file,
 		 * nice and simple solution (IMHO), and it seems to
 		 * work:-)
 		 */
-		if (EX_WGATHER(exp)) {
+		if (use_wgather) {
 			if (atomic_read(&inode->i_writecount) > 1
 			    || (last_ino == inode->i_ino && last_dev == inode->i_sb->s_dev)) {
 				dprintk("nfsd: write defer %d\n", task_pid_nr(current));
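
For readers who want to see the effect of the change outside the kernel, here is a minimal user-space sketch of the decision the patch makes. It is not the nfsd code: struct write_req, struct write_plan and plan_write() are hypothetical names, and the fields only model rqstp->rq_vers, EX_WGATHER(exp), EX_ISSYNC(exp) and the fsync check from the hunks above. It also assumes that the gathering branch is only reached for stable writes, which the hunks do not show.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the state nfsd_vfs_write() consults. */
struct write_req {
	int  rpc_vers;        /* models rqstp->rq_vers (2, 3 or 4) */
	bool export_wgather;  /* models EX_WGATHER(exp), i.e. 'wdelay' in effect */
	bool export_sync;     /* models EX_ISSYNC(exp) */
	bool has_fsync;       /* models file->f_op->fsync != NULL */
	int  stable;          /* requested stability; nonzero means stable */
};

/* What the server would do with this write (illustrative only). */
struct write_plan {
	bool set_o_sync;      /* write synchronously via O_SYNC */
	bool may_gather;      /* write may take the gathering (delay) path */
};

static struct write_plan plan_write(const struct write_req *req)
{
	struct write_plan plan;
	int stable = req->stable;

	/* The patched rule: gathering is considered for NFSv2 only. */
	bool use_wgather = (req->rpc_vers == 2) && req->export_wgather;

	if (!req->has_fsync)      /* COMMIT cannot work, force a stable write */
		stable = 2;
	if (!req->export_sync)    /* async export: never write stably */
		stable = 0;

	plan.set_o_sync = stable && !use_wgather;
	/* Assumption: the gathering branch is only reached for stable writes. */
	plan.may_gather = stable && use_wgather;
	return plan;
}
```

With a sync export that has wdelay in effect, a FILE_SYNC write from an NFSv3 client yields set_o_sync = true and may_gather = false in this model, while the same write from an NFSv2 client yields the opposite. Before the patch, both would have taken the gathering path.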

