Re: nfs-backed mmap file results in 1000s of WRITEs per second

"Myklebust, Trond" <Trond.Myklebust@xxxxxxxxxx> · Fri, 6 Sep 2013 15:00:56 +0000

On Fri, 2013-09-06 at 09:36 -0400, Jeff Layton wrote:
> On Thu, 5 Sep 2013 17:34:20 -0500
> Quentin Barnes <qbarnes@xxxxxxxxx> wrote:
> 
> > On Thu, Sep 05, 2013 at 09:57:24PM +0000, Myklebust, Trond wrote:
> > > On Thu, 2013-09-05 at 16:36 -0500, Quentin Barnes wrote:
> > > > On Thu, Sep 05, 2013 at 08:02:01PM +0000, Myklebust, Trond wrote:
> > > > > On Thu, 2013-09-05 at 14:11 -0500, Quentin Barnes wrote:
> > > > > > On Thu, Sep 05, 2013 at 12:03:03PM -0500, Malahal Naineni wrote:
> > > > > > > Neil Brown posted a patch a couple of days ago for this!
> > > > > > > 
> > > > > > > http://thread.gmane.org/gmane.linux.nfs/58473
> > > > > > 
> > > > > > I tried Neil's patch on a v3.11 kernel.  The rebuilt kernel still
> > > > > > exhibited the same 1000s of WRITEs/sec problem.
> > > > > > 
> > > > > > Any other ideas?
> > > > > 
> > > > > Yes. Please try the attached patch.
> > > > 
> > > > Great!  That did the trick!
> > > > 
> > > > Do you feel this patch is worth pushing upstream in its
> > > > current state, or was it just to verify a theory?
> > > > 
> > > > 
> > > > In comparing the nfs_flush_incompatible() implementations between
> > > > RHEL5 and v3.11 (without your patch), the guts of the algorithm seem
> > > > more or less logically equivalent to me on whether or not to flush
> > > > the page.  Also, when and where nfs_flush_incompatible() is invoked
> > > > seems the same.  Would you provide a very brief pointer to clue me
> > > > in as to why this problem didn't also manifest circa 2.6.18 days?
> > > 
> > > There was no nfs_vm_page_mkwrite() to handle page faults in the 2.6.18
> > > days, and so the risk was that your mmapped writes could end up being
> > > sent with the wrong credentials.
> > 
> > Ah!  You're right that nfs_vm_page_mkwrite() was missing from
> > the original 2.6.18, so that makes sense.  However, Red Hat
> > backported that function starting with their RHEL5.9(*) kernels,
> > yet the problem doesn't manifest on RHEL5.9.  Maybe the answer lies
> > somewhere in RHEL5.9's do_wp_page(), or up that call path, but
> > at a glance it all looks pretty close.
> > 
> > 
> > (*) That was the source I was using when comparing with the 3.11 source
> > when studying your patch, since it was the last kernel known to me
> > without the problem.
> > 
> 
> I'm pretty sure RHEL5 has a similar problem, but it's unclear to me why
> you're not seeing it there. I have a RHBZ open vs. RHEL5 but it's marked
> private at the moment (I'll see about opening it up). I brought this up
> upstream about a year ago with this strawman patch:
> 
>     http://article.gmane.org/gmane.linux.nfs/51240
> 
> ...at the time Trond said he was working on a set of patches to track
> the open/lock stateid on a per-req basis. Did that approach not pan
> out?

We've achieved what we wanted to do (Neil's lock recovery patch) without
that machinery, so for now, we're dropping that.

> Also, do you need to do a similar fix to nfs_can_coalesce_requests?

Yes. Good point!

-- 
Trond Myklebust
Linux NFS client maintainer

NetApp
Trond.Myklebust@xxxxxxxxxx
www.netapp.com