On Fri, Sep 27, 2013 at 11:37:45AM +0200, Jan Kara wrote:
> On Fri 27-09-13 10:55:53, Dave Chinner wrote:
> > On Thu, Sep 26, 2013 at 09:23:58PM +0200, Jan Kara wrote:
> > > When there are processes heavily creating small files while sync(2) is
> > > running, it can easily happen that quite a few new files are created
> > > between the WB_SYNC_NONE and WB_SYNC_ALL passes of sync(2). That can
> > > happen especially if there are several busy filesystems (remember that
> > > sync traverses filesystems sequentially and waits in the WB_SYNC_ALL
> > > phase on one fs before starting it on another fs). Because the
> > > WB_SYNC_ALL pass is slow (e.g. it causes a transaction commit and cache
> > > flush for each inode in ext3), the resulting sync(2) times are rather
> > > large.
....
> > > To give some numbers, when the above script is run on two ext4
> > > filesystems on a simple SATA drive, the average sync time from 10 runs
> > > is 267.549 seconds with standard deviation 104.799426. With the patched
> > > kernel, the average sync time from 10 runs is 2.995 seconds with
> > > standard deviation 0.096.
> >
> > Hmmmm. 2.8 seconds on my XFS perf VM without the patch. Ok, try a
> > smaller VM backed by a single spindle of spinning rust rather than
> > SSDs. Over 10 runs I see:
> >
> > kernel   min    max    avg
> > vanilla  0.18s  4.46s  1.63s
> > patched  0.14s  0.45s  0.28s
> >
> > Definitely an improvement, but nowhere near the numbers you are
> > seeing for ext4 - maybe XFS isn't as susceptible to this problem
> > as ext4. Nope, ext4 on an unpatched kernel gives 1.66/6.81/3.12s
> > (which is less than your patched kernel results :), so it must be
> > something else configuration/hardware related.
>   Have you really used *two* (or more) busy filesystems? That makes the
> problem an order of magnitude worse for me. The numbers I've posted are
> for such a situation...

I had to bump it up to 5 active filesystems before it fell off the
order of magnitude cliff. Still not the ~260s times you were seeing -
only about ~30s per sync - but you are right in that there definitely
is a load point where things go really bad.

Now I've found that point, I can confirm that the patch fixes it :)

Cheers,

Dave.
--
Dave Chinner
david@xxxxxxxxxxxxx
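
[Editorial note: the actual test script from Jan's original mail is trimmed
above ("...."). The following is only an illustrative sketch of the kind of
workload the thread describes - several processes busily creating small
files on different filesystems while sync(2) is timed. The program name,
file counts, and mount-point arguments are made up for illustration and are
not from the original posting.]

/*
 * syncload.c - illustrative reproducer (not the script from the original
 * mail): fork one file-creating process per filesystem given on the
 * command line, then time how long sync(2) takes while they keep
 * dirtying new inodes.
 */
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <fcntl.h>
#include <signal.h>
#include <time.h>
#include <sys/types.h>
#include <sys/wait.h>

static void create_small_files(const char *dir)
{
	char path[4096];
	int i, fd;

	for (i = 0; ; i++) {
		snprintf(path, sizeof(path), "%s/f%d", dir, i);
		fd = open(path, O_CREAT | O_WRONLY | O_TRUNC, 0644);
		if (fd < 0)
			exit(1);
		/* tiny file: each iteration dirties a fresh inode */
		if (write(fd, "x", 1) != 1)
			exit(1);
		close(fd);
	}
}

int main(int argc, char **argv)
{
	pid_t pids[16];
	struct timespec t1, t2;
	int i, nkids = 0;

	/* argv[1..] are mount points of the busy filesystems */
	for (i = 1; i < argc && nkids < 16; i++, nkids++) {
		pids[nkids] = fork();
		if (pids[nkids] == 0)
			create_small_files(argv[i]);
	}

	sleep(5);		/* let dirty inodes pile up first */

	clock_gettime(CLOCK_MONOTONIC, &t1);
	sync();			/* WB_SYNC_NONE pass, then the slow WB_SYNC_ALL pass */
	clock_gettime(CLOCK_MONOTONIC, &t2);

	printf("sync(2) took %.3fs\n",
	       (t2.tv_sec - t1.tv_sec) + (t2.tv_nsec - t1.tv_nsec) / 1e9);

	/* stop the file creators */
	for (i = 0; i < nkids; i++) {
		kill(pids[i], SIGKILL);
		waitpid(pids[i], NULL, 0);
	}
	return 0;
}

Running it with several mount points (e.g. ./syncload /mnt/a /mnt/b /mnt/c)
mimics the multi-filesystem case that makes the unpatched kernel fall off
the cliff, since the WB_SYNC_ALL phase is waited on per filesystem in turn.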