On Mon, 2013-09-09 at 12:32 -0500, Quentin Barnes wrote:
> On Mon, Sep 09, 2013 at 09:04:24AM -0400, Jeff Layton wrote:
> > On Fri, 6 Sep 2013 11:48:45 -0500
> > Quentin Barnes <qbarnes@xxxxxxxxx> wrote:
> >
> > > Jeff, can you try out my test program in the base note on your
> > > RHEL5.9 or later RHEL5.x kernels?
> > >
> > > I reverified that running the test on a 2.6.18-348.16.1.el5 x86_64
> > > kernel (latest released RHEL5.9) does not show the problem for me.
> > > Based on what you and Trond have said in this thread though, I'm
> > > really curious why it doesn't have the problem.
> >
> > I can confirm what you see on RHEL5. One difference is that RHEL5's
> > page_mkwrite handler does not do wait_on_page_writeback. That was added
> > as part of the stable pages work that went in a while back, so that may
> > be the main difference. Adding that in doesn't seem to materially
> > change things though.
>
> Good to know you confirmed the behavior I saw on RHEL5 (just so that
> I know it's not some random variable in play I had overlooked).
>
> > In any case, what I see is that the initial program just ends up with
> > two calls to nfs_vm_page_mkwrite(). They both push out a WRITE and then
> > things settle down (likely because the page is still marked dirty).
> >
> > Eventually, another write occurs and the dirty page gets pushed out to
> > the server in a small flurry of WRITEs to the same range. Then, things
> > settle down again until there's another small flurry of activity.
> >
> > My suspicion is that there is a race condition involved here, but I'm
> > unclear on where it is. I'm not 100% convinced this is a bug, but page
> > fault semantics aren't my strong suit.
>
> As a test on RHEL6, I made a trivial systemtap script for kprobing
> nfs_vm_page_mkwrite() and nfs_flush_incompatible(). I wanted to
> make sure this bug was limited to just the nfs module and was not a
> result of some mm behavior change.
>
> With the bug unfixed and the test program running, nfs_vm_page_mkwrite()
> and nfs_flush_incompatible() are called repeatedly at a very high rate
> (hence all the WRITEs).
>
> After Trond's patch, the two functions are called just at the
> program's initialization and then called only every 30 seconds or
> so.
>
> It looks to me from the code flow that there must be something
> nfs_wb_page() does that resets the need for mm to keep reinvoking
> nfs_vm_page_mkwrite(). I didn't look any deeper than that though
> for now. Maybe a race in how nfs_wb_page() updates status you're
> thinking of?

In RHEL-5, nfs_wb_page() is just a wrapper to nfs_sync_inode_wait(),
which does _not_ call clear_page_dirty_for_io() (and hence does not
call page_mkclean()).

That would explain it...

-- 
Trond Myklebust
Linux NFS client maintainer

NetApp
Trond.Myklebust@xxxxxxxxxx
www.netapp.com
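
The point about page_mkclean() can be illustrated with a toy model. This is
only a sketch in Python, not kernel code: ToyPage, store(), writeback() and
count_mkwrite_calls() are made-up names, and the model assumes nothing beyond
the general rule that page_mkclean() write-protects the mapping so that the
next store faults and reaches the page_mkwrite handler again. It does not
model the suspected race or Trond's patch.

```python
from dataclasses import dataclass


@dataclass
class ToyPage:
    """Hypothetical stand-in for one mmap'd page; not a kernel structure."""
    pte_writable: bool = False   # is the mapping currently writable?
    mkwrite_calls: int = 0       # how often the page_mkwrite handler ran


def store(page: ToyPage) -> None:
    """Simulate a store to the mapped page."""
    if not page.pte_writable:
        # Write fault on a read-only mapping: the page_mkwrite handler runs
        # (nfs_vm_page_mkwrite() in the thread), then the mapping is writable.
        page.mkwrite_calls += 1
        page.pte_writable = True


def writeback(page: ToyPage, calls_page_mkclean: bool) -> None:
    """Simulate a writeback of the page.

    calls_page_mkclean=True models a writeback path that goes through
    clear_page_dirty_for_io() and hence page_mkclean(); False models a path
    that skips it, as described for RHEL-5 above.
    """
    if calls_page_mkclean:
        # page_mkclean() write-protects the mapping again.
        page.pte_writable = False


def count_mkwrite_calls(rounds: int, calls_page_mkclean: bool) -> int:
    """Alternate store and writeback `rounds` times; return the fault count."""
    page = ToyPage()
    for _ in range(rounds):
        store(page)
        writeback(page, calls_page_mkclean)
    return page.mkwrite_calls


if __name__ == "__main__":
    print("with page_mkclean:   ", count_mkwrite_calls(5, True))
    print("without page_mkclean:", count_mkwrite_calls(5, False))
```

In this model, every writeback that goes through page_mkclean() costs one
more page_mkwrite call on the next store, while a writeback that skips it
leaves the mapping writable and the handler runs only once.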