On Mon, Dec 12, 2011 at 08:40:15AM +0800, Xupeng Yun wrote:
> On Mon, Dec 12, 2011 at 07:39, Dave Chinner <david@xxxxxxxxxxxxx> wrote:
> > > ====== XFS + 2.6.29 ======
> > >
> > > Read 21GB @ 11k iops, 210MB/s, av latency of 1.3ms/IO
> > > Wrote 2.3GB @ 1250 iops, 20MB/s, av latency of 0.27ms/IO
> > > Total 1.5m IOs, 95% @ <= 2ms
> > >
> > > ====== XFS + 2.6.39 ======
> > >
> > > Read 6.5GB @ 3.5k iops, 55MB/s, av latency of 4.5ms/IO
> > > Wrote 700MB @ 386 iops, 6MB/s, av latency of 0.39ms/IO
> > > Total 460k IOs, 95% @ <= 10ms, 4ms > 50% < 10ms
> >
> > Looking at the IO stats there, this doesn't look to me like an XFS
> > problem. The IO times are much, much longer on 2.6.39, so that's the
> > first thing to understand. If the two tests are doing identical IO
> > patterns, then I'd be looking at validating raw device performance
> > first.
>
> Thank you Dave.
>
> I also did raw device and ext4 performance tests with 2.6.39. All of
> these tests use identical IO patterns (non-buffered IO, 16 IO threads,
> 16KB block size, mixed random read and write, r:w = 9:1):
>
> ====== raw device + 2.6.39 ======
> Read 21.7GB @ 11.6k IOPS, 185MB/s, av latency of 1.37ms/IO
> Wrote 2.4GB @ 1.3k IOPS, 20MB/s, av latency of 0.095ms/IO
> Total 1.5M IOs, 96% @ <= 2ms
>
> ====== ext4 + 2.6.39 ======
> Read 21.7GB @ 11.6k IOPS, 185MB/s, av latency of 1.37ms/IO
> Wrote 2.4GB @ 1.3k IOPS, 20MB/s, av latency of 0.1ms/IO
> Total 1.5M IOs, 96% @ <= 2ms
>
> ====== XFS + 2.6.39 ======
> Read 6.5GB @ 3.5k iops, 55MB/s, av latency of 4.5ms/IO
> Wrote 700MB @ 386 iops, 6MB/s, av latency of 0.39ms/IO
> Total 460k IOs, 95% @ <= 10ms, 4ms > 50% < 10ms

Oh, of course, now I remember what the problem is - it's a locking
issue that was fixed in 3.0.11, 3.1.5 and 3.2-rc1:

commit 0c38a2512df272b14ef4238b476a2e4f70da1479
Author: Dave Chinner <dchinner@xxxxxxxxxx>
Date:   Thu Aug 25 07:17:01 2011 +0000

    xfs: don't serialise direct IO reads on page cache checks

    There is no need to grab the i_mutex or the IO lock in exclusive
    mode if we don't need to invalidate the page cache. Taking these
    locks on every direct IO effectively serialises them, as taking
    the IO lock in exclusive mode has to wait for all shared holders
    to drop the lock. That only happens when IO is complete, so in
    effect it prevents dispatch of concurrent direct IO reads to the
    same inode.

    Fix this by taking the IO lock shared to check the page cache
    state, and only then drop it and take the IO lock exclusively if
    there is work to be done. Hence for the normal direct IO case, no
    exclusive locking will occur.

    Signed-off-by: Dave Chinner <dchinner@xxxxxxxxxx>
    Tested-by: Joern Engel <joern@xxxxxxxxx>
    Reviewed-by: Christoph Hellwig <hch@xxxxxx>
    Signed-off-by: Alex Elder <aelder@xxxxxxx>

Cheers,

Dave.
--
Dave Chinner
david@xxxxxxxxxxxxx

_______________________________________________
xfs mailing list
xfs@xxxxxxxxxxx
http://oss.sgi.com/mailman/listinfo/xfs
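
[The thread does not name the benchmark tool. A fio job along the
following lines would approximate the workload described above
(non-buffered IO, 16 IO threads, 16KB block size, mixed random
read/write at r:w = 9:1); the ioengine, filename, size and runtime
values are illustrative assumptions, not taken from the thread.]

; hypothetical fio job approximating the reported workload
[global]
direct=1            ; non-buffered (O_DIRECT) IO
rw=randrw           ; mixed random read and write
rwmixread=90        ; r:w = 9:1
bs=16k              ; 16KB block size
ioengine=psync      ; assumed; one synchronous IO per thread
runtime=300
time_based=1
group_reporting=1

[randrw-16k]
numjobs=16          ; 16 IO threads
filename=/data/fio.testfile   ; hypothetical test file
size=4g             ; hypothetical per-thread working set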
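
[As a rough illustration of the locking pattern the commit describes -
check the page cache state under the shared IO lock, fall back to the
exclusive lock only when an invalidation is actually needed, and
re-check after upgrading - here is a minimal userspace sketch built on
a POSIX rwlock. The names and structure are invented for the example;
the real XFS code uses XFS_IOLOCK_SHARED/XFS_IOLOCK_EXCL and can demote
an exclusive lock to shared atomically, which POSIX rwlocks cannot.
Build with: cc -pthread locking_sketch.c]

/* locking_sketch.c - illustration only, not the actual XFS code. */
#include <pthread.h>
#include <stdio.h>

struct fake_inode {
	pthread_rwlock_t iolock;	/* stand-in for the XFS IO lock */
	long nrpages;			/* stand-in for mapping->nrpages */
};

/* Caller must hold iolock exclusively: throw away the cached pages. */
static void invalidate_page_cache(struct fake_inode *ip)
{
	printf("invalidating %ld cached pages\n", ip->nrpages);
	ip->nrpages = 0;
}

static void direct_io_read(struct fake_inode *ip)
{
	pthread_rwlock_rdlock(&ip->iolock);

	if (ip->nrpages > 0) {
		/*
		 * Cached pages exist, so the exclusive lock is needed to
		 * invalidate them.  Drop shared, take exclusive, and
		 * re-check, because another thread may have done the
		 * invalidation while we held no lock.
		 */
		pthread_rwlock_unlock(&ip->iolock);
		pthread_rwlock_wrlock(&ip->iolock);
		if (ip->nrpages > 0)
			invalidate_page_cache(ip);
		/*
		 * The kernel demotes exclusive -> shared atomically here;
		 * POSIX rwlocks can only drop and re-acquire.
		 */
		pthread_rwlock_unlock(&ip->iolock);
		pthread_rwlock_rdlock(&ip->iolock);
	}

	/*
	 * Dispatch the direct IO read while holding the lock shared, so
	 * other direct IO readers to the same inode run concurrently
	 * instead of queueing behind an exclusive lock.
	 */
	printf("direct IO read dispatched (nrpages=%ld)\n", ip->nrpages);

	pthread_rwlock_unlock(&ip->iolock);
}

int main(void)
{
	struct fake_inode ip;

	pthread_rwlock_init(&ip.iolock, NULL);
	ip.nrpages = 4;

	direct_io_read(&ip);	/* first read has to invalidate the cache */
	direct_io_read(&ip);	/* later reads stay on the shared lock */

	pthread_rwlock_destroy(&ip.iolock);
	return 0;
}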