On Mon, Apr 11, 2016 at 10:14:06PM +0800, Songbo Wang wrote:
> Hi xfsers:
>
> I got some troubles with the performance of xfs.
> The environment is:
> xfs version is 3.2.1,
> centos 7.1,
> kernel version: 3.10.0-229.el7.x86_64,
> pcie-ssd card,
> mkfs: mkfs.xfs /dev/hioa2 -f -n size=64k -i size=512 -d agcount=40 -l
> size=1024m
> mount: mount /dev/hioa2 /mnt/ -t xfs -o
> rw,noexec,nodev,noatime,nodiratime,nobarrier,discard,inode64,logbsize=256k,delaylog
>
> I use the following command to test iops: fio -ioengine=libaio -bs=4k
> -direct=1 -thread -rw=randwrite -size=50G -filename=/mnt/test -name="EBS
> 4KB randwrite test" -iodepth=64 -runtime=60
>
> The results are normal at the beginning, about 210k±, but some seconds
> later they drop to around 19k±.

Looks like the workload runs out of log space due to all the allocation
transactions being logged, which then causes new transactions to start
tail-pushing the log to flush dirty metadata. This is needed to make
more space in the log for incoming dio writes that require allocation
transactions, and it will block IO submission until there is space
available in the log.

Let's face it: all that test does is create a massively fragmented 50GB
file, so you're going to have a lot of metadata to log. Do the maths -
if it runs at 200kiops for a few seconds, it's created a million
extents. And it's doing random inserts into the extent btree, so it's
repeatedly dirtying the entire extent btree. This triggers journal
commits quite frequently, because a large amount of metadata is being
dirtied.

e.g. at ~500 extent records per 4k block, a million extents require
2000 leaf blocks to store them all. That's 8MB of metadata per million
extents that this workload is generating and repeatedly dirtying. Then
there's the other metadata, like the free space btrees, that is also
being repeatedly dirtied, so it would not be unexpected for a workload
like this on a high-IOPS device to be allocating 100MB of metadata
every few seconds, with the amount being journalled steadily increasing
until the file is fully populated.

> I did a second test:
> umount the /dev/hioa2,
> fio -ioengine=libaio -bs=4k -direct=1 -thread -rw=randwrite
> -filename=/dev/hioa2 -name="EBS 8KB randwrite test" -iodepth=64
> -runtime=60
>
> The results were normal; the iops was about 210k± all the time.

That's not an equivalent test - it's being run direct to the block
device, not to a file on the filesystem on the block device, so you
won't see the artifacts that are a result of creating worst case file
fragmentation....

Cheers,

Dave.
-- 
Dave Chinner
david@xxxxxxxxxxxxx
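
PS: if you want to see the fragmentation and check the arithmetic above
for yourself, here's a rough sketch. The /mnt/test path is just the file
from your fio run, and the ~500 records per 4k leaf block is the same
estimate as above, not an exact on-disk count:

    # count the extents in the fragmented test file (xfs_bmap prints one
    # extent per line after a header line, so this is approximate)
    xfs_bmap /mnt/test | wc -l

    # back-of-the-envelope metadata estimate for a million extents
    echo "$((1000000 / 500)) leaf blocks"       # ~2000 leaf blocks
    echo "$((1000000 / 500 * 4)) KB of leaves"  # ~8000KB, i.e. ~8MB

And remember that's just the leaf blocks created once - the repeated
re-dirtying of those blocks at every journal commit is what multiplies
the log traffic.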