On Thu, Mar 31, 2016 at 08:29:35AM -0600, Jens Axboe wrote:
> On 03/31/2016 02:24 AM, Dave Chinner wrote:
> >On Wed, Mar 30, 2016 at 09:07:48AM -0600, Jens Axboe wrote:
> >>Hi,
> >>
> >>This patchset isn't as much a final solution as it is a demonstration
> >>of what I believe is a huge issue. Since the dawn of time, our
> >>background buffered writeback has sucked. When we do background
> >>buffered writeback, it should have little impact on foreground
> >>activity. That's the definition of background activity... But for as
> >>long as I can remember, heavy buffered writers have not behaved like
> >>that. For instance, if I do something like this:
> >>
> >>$ dd if=/dev/zero of=foo bs=1M count=10k
> >>
> >>on my laptop and then try to start chrome, it basically won't start
> >>before the buffered writeback is done. Or, for server oriented
> >>workloads, where installation of a big RPM (or similar) adversely
> >>impacts database reads or sync writes. When that happens, I get people
> >>yelling at me.
> >>
> >>Last time I posted this, I used flash storage as the example. But
> >>this works equally well on rotating storage. Let's run a test case
> >>that writes a lot. This test writes 50 files, each 100M, on XFS on
> >>a regular hard drive. While this happens, we attempt to read
> >>another file with fio.
> >>
> >>Writers:
> >>
> >>$ time (./write-files ; sync)
> >>real    1m6.304s
> >>user    0m0.020s
> >>sys     0m12.210s
> >
> >Great. So a basic IO test looks good - let's throw something more
> >complex at it. Say, a benchmark I've been using for years to stress
> >the IO subsystem, the filesystem and memory reclaim all at the same
> >time: a concurrent fsmark inode creation test.
> >(first google hit: https://lkml.org/lkml/2013/9/10/46)
>
> Is that how you are invoking it as well, same arguments?

Yes. And the VM is exactly the same, too - 16p/16GB RAM. Cut-down
version of the script I use:

#!/bin/bash

QUOTA=
MKFSOPTS=
NFILES=100000
DEV=/dev/vdc
LOGBSIZE=256k
FSMARK=/home/dave/src/fs_mark-3.3/fs_mark
MNT=/mnt/scratch

while [ $# -gt 0 ]; do
        case "$1" in
        -q)     QUOTA="uquota,gquota,pquota" ;;
        -N)     NFILES=$2 ; shift ;;
        -d)     DEV=$2 ; shift ;;
        -l)     LOGBSIZE=$2 ; shift ;;
        --)     shift ; break ;;
        esac
        shift
done
MKFSOPTS="$MKFSOPTS $*"

echo QUOTA=$QUOTA
echo MKFSOPTS=$MKFSOPTS
echo DEV=$DEV

sudo umount $MNT > /dev/null 2>&1
sudo mkfs.xfs -f $MKFSOPTS $DEV
sudo mount -o nobarrier,logbsize=$LOGBSIZE,$QUOTA $DEV $MNT
sudo chmod 777 $MNT
sudo sh -c "echo 1 > /proc/sys/fs/xfs/stats_clear"
time $FSMARK -D 10000 -S0 -n $NFILES -s 0 -L 32 \
        -d $MNT/0  -d $MNT/1 \
        -d $MNT/2  -d $MNT/3 \
        -d $MNT/4  -d $MNT/5 \
        -d $MNT/6  -d $MNT/7 \
        -d $MNT/8  -d $MNT/9 \
        -d $MNT/10 -d $MNT/11 \
        -d $MNT/12 -d $MNT/13 \
        -d $MNT/14 -d $MNT/15 \
        | tee >(stats --trim-outliers | tail -1 1>&2)
sync
sudo umount /mnt/scratch

> >>The above was run without scsi-mq, and using the deadline scheduler;
> >>results with CFQ are similarly depressing for this test. So IO
> >>scheduling is in place for this test; it's not pure blk-mq without
> >>scheduling.
> >
> >virtio in guest, XFS direct IO -> no-op -> scsi in host.
>
> That has write back caching enabled on the guest, correct?

No. It uses virtio,cache=none (that's the "XFS direct IO" bit above).
Sorry for not being clear about that.

Cheers,

Dave.
--
Dave Chinner
david@xxxxxxxxxxxxx
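
For reference, a minimal sketch of the interference test described at the
top of the thread: 50 buffered writers of 100M each plus one fio reader.
The actual write-files script and fio job are not included in the thread,
so the mount point, file layout, and fio parameters below are assumptions:

#!/bin/bash
# Sketch of the buffered-writeback interference test. File sizes, paths,
# and fio parameters are assumptions; the original write-files script and
# fio job are not shown in the thread.

MNT=/mnt/scratch

# Pre-create the read target and drop it from cache so the reader
# actually goes to disk.
dd if=/dev/zero of=$MNT/readfile bs=1M count=1024
sync
echo 3 | sudo tee /proc/sys/vm/drop_caches > /dev/null

# Background buffered writers: 50 files, 100M each.
for i in $(seq 1 50); do
        dd if=/dev/zero of=$MNT/file-$i bs=1M count=100 &
done

# Foreground reader; completion latencies show the writeback impact.
fio --name=reader --filename=$MNT/readfile --rw=randread --bs=4k \
        --runtime=30 --time_based

wait
sync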
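
To watch the background writeback that starves the reader, the standard
dirty-page counters and thresholds can be sampled while the test runs (a
sketch; the values on the actual test machines are not given in the
thread):

# Dirty and under-writeback page counts, sampled once a second:
watch -n1 'grep -E "^(Dirty|Writeback):" /proc/meminfo'

# Thresholds controlling when background and foreground (blocking)
# writeback kick in; defaults vary by distro:
sysctl vm.dirty_background_ratio vm.dirty_ratio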
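
The guest setup implied by "virtio,cache=none" would look something like
the QEMU invocation below. This is a hypothetical reconstruction - the
actual command line and backing device are not in the thread - but
cache=none does mean QEMU opens the backing storage with O_DIRECT on the
host, so guest IO bypasses the host page cache:

# Hypothetical backing device; 16p/16GB matches the VM described above.
qemu-system-x86_64 \
        -smp 16 -m 16G \
        -drive file=/dev/mapper/scratch,if=virtio,cache=none,format=raw,aio=native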