On Wed, 2013-09-04 at 11:36 +1000, NeilBrown wrote: > Hi, > we have a customer who has noticed a significant regression in NFS > performance between SLES10 (2.6.16+) and SLES11 (3.0+). > > The use case involves two different processes on the same client mmapping a > file and making changes. > > In 2.6.16, all these changes would be cached on the client until a flush or > close or other normal process flushed the changes out - much the same as if > just one process was accessing the file. > > In 3.0 the file is flushed out every time a different process performs an > update. So if they interleave accesses, you get a flush-to-the-server on > every access. > > This is caused by nfs_flush_incompatible. It deems access through different > file descriptors to be incompatible. However I don't think this is > justified. Accessed using different credentials can reasonably be seen as > incompatible(*), but if two processes with identical credentials open the > same file there there should be not need to differentiate between them. > > The patch below changes the test in nfs_flush_incompatible to just test the > credentials (and the dentry, though that might be superfluous - it seems > safer). > > (*) In this specific customer's case the process do currently have slightly > different credentials. In one, the primary gid is included in the list of > auxiliary gids. In the other the primary gid is excluded. This is enough > to make the credentials appear different so this patch doesn't help in that > case. They are looking at being more consistent in their setgroups usage > but it would help them if we only actually compared the credentials that > were important. In this case there is no important difference. > Would it make sense to treat a more broad credential as compatible with a > more narrow credential? So a write from a process with a given > uid,gid,grouplist could ultimately be sent to the server with same uid/gid, > but a subset of the group list? > > > Thoughts? > > Thanks, > NeilBrown > > > diff --git a/fs/nfs/write.c b/fs/nfs/write.c > index 7816801..b0d4e47 100644 > --- a/fs/nfs/write.c > +++ b/fs/nfs/write.c > @@ -860,7 +860,12 @@ int nfs_flush_incompatible(struct file *file, struct page *page) > if (req == NULL) > return 0; > l_ctx = req->wb_lock_context; > - do_flush = req->wb_page != page || req->wb_context != ctx; > + do_flush = req->wb_page != page; > + if (req->wb_context != ctx) { > + do_flush |= > + req->wb_context->dentry != ctx->dentry || > + req->wb_context->cred != ctx->cred; > + } > if (l_ctx) { > do_flush |= l_ctx->lockowner.l_owner != current->files > || l_ctx->lockowner.l_pid != current->tgid; Hi Neil, Won't this cause any write errors to be propagated to the wrong process? Cheers Trond -- Trond Myklebust Linux NFS client maintainer NetApp Trond.Myklebust@xxxxxxxxxx www.netapp.com ��.n��������+%������w��{.n�����{��w���jg��������ݢj����G�������j:+v���w�m������w�������h�����٥