Re: NFS client growing system CPU

On Wed, Sep 28, 2011 at 12:58:35PM -0700, Simon Kirby wrote:

> On Tue, Sep 27, 2011 at 01:04:15PM -0400, Trond Myklebust wrote:
> 
> > On Tue, 2011-09-27 at 09:49 -0700, Simon Kirby wrote: 
> > > On Tue, Sep 27, 2011 at 07:42:53AM -0400, Trond Myklebust wrote:
> > > 
> > > > On Mon, 2011-09-26 at 17:39 -0700, Simon Kirby wrote: 
> > > > > Hello!
> > > > > 
> > > > > Following up on "System CPU increasing on idle 2.6.36", this issue is
> > > > > still happening even on 3.1-rc7. So, since it has been 9 months since I
> > > > > reported this, I figured I'd bisect this issue. The first bisection ended
> > > > > in an IPMI regression that looked like the problem, so I had to start
> > > > > again. Eventually, I got commit b80c3cb628f0ebc241b02e38dd028969fb8026a2
> > > > > which made it into 2.6.34-rc4.
> > > > > 
> > > > > With this commit, system CPU keeps rising as the log crunch box runs
> > > > > (reads log files via NFS and spews out HTML files into NFS-mounted report
> > > > > directories). When it finishes the daily run, the system time stays
> > > > > non-zero and continues to be higher and higher after each run, until the
> > > > > box never completes a run within a day due to all of the wasted cycles.
> > > > 
> > > > So reverting that commit fixes the problem on 3.1-rc7?
> > > > 
> > > > As far as I can see, doing so should be safe thanks to commit
> > > > 5547e8aac6f71505d621a612de2fca0dd988b439 (writeback: Update dirty flags
> > > > in two steps) which fixes the original problem at the VFS level.
> > > 
> > > Hmm, I went to git revert b80c3cb628f0ebc241b02e38dd028969fb8026a2, but
> > > for some reason git left the nfs_mark_request_dirty(req); line in
> > > nfs_writepage_setup(), even though the original commit had that. Is that
> > > OK or should I remove that as well?
> > > 
> > > Once that is sorted, I'll build it and let it run for a day and let you
> > > know. Thanks!
> > 
> > It shouldn't make any difference whether you leave it or remove it. The
> > resulting second call to __set_page_dirty_nobuffers() will always be a
> > no-op since the page will already be marked as dirty.
> 
> Ok, confirmed, git revert b80c3cb628f0ebc241b02e38dd028969fb8026a2 on
> 3.1-rc7 fixes the problem for me. Does this make sense, then, or do we
> need further investigation and/or testing?

To clarify what I said before: on plain 3.1-rc8, I am able to clear the
endless CPU use in nfs_writepages just by running "sync". I am not sure
when this changed, but I'm fairly sure that on some versions between
2.6.34 and 3.1-rc, "sync" alone had no effect unless it was paired with
drop_caches. Maybe this makes the problem more obvious...
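
For anyone wanting to compare system CPU before and after a "sync", here
is a minimal sketch, not the method used for the numbers in this thread.
It assumes the standard Linux /proc/stat layout (user, nice, system,
idle, iowait, irq, softirq, steal, ...); the function names are
hypothetical and chosen only for illustration.

```python
def parse_cpu_line(stat_text):
    """Return the aggregate 'cpu' counters from /proc/stat text as ints."""
    for line in stat_text.splitlines():
        fields = line.split()
        if fields and fields[0] == "cpu":
            return [int(v) for v in fields[1:]]
    raise ValueError("no aggregate 'cpu' line found")


def system_share(before, after):
    """Fraction of elapsed jiffies spent in system mode between two samples."""
    a = parse_cpu_line(before)
    b = parse_cpu_line(after)
    # Only the first eight fields are summed; guest time is already
    # accounted for inside the user and nice counters.
    deltas = [y - x for x, y in zip(a[:8], b[:8])]
    total = sum(deltas)
    # Index 2 is the 'system' counter in the /proc/stat field order.
    return deltas[2] / total if total else 0.0


def read_stat(path="/proc/stat"):
    """Read the raw contents of /proc/stat (Linux only)."""
    with open(path) as f:
        return f.read()
```

Take one pair of read_stat() samples a few seconds apart while the box
is idle after a run, run "sync", then take another pair. If the
behaviour matches what I described, system_share() for the second pair
should be much lower than for the first.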

Simon-

