> > So the cfq behavior is pretty undetermined. I more or less realize
> > this from the experiments. For example, when starting 2+ "dd oflag=direct"
> > tasks in one single cgroup, they _sometimes_ progress at different rates.
> > See the attached graphs for two such examples on XFS. ext4 is fine.
> >
> > The 2-dd test case is:
> >
> >         mkdir /cgroup/dd
> >         echo $$ > /cgroup/dd/tasks
> >
> >         dd if=/dev/zero of=/fs/zero1 bs=1M oflag=direct &
> >         dd if=/dev/zero of=/fs/zero2 bs=1M oflag=direct &
> >
> > The 6-dd test case is similar.
> Hum, interesting. I would not expect that. Maybe it's because files are
> allocated at the different area of the disk. But even then the difference
> should not be *that* big.

Agreed.

> > > > Look at this graph, the 4 dd tasks are granted the same weight (2 of
> > > > them are buffered writes). I guess the 2 buffered dd tasks managed to
> > > > progress much faster than the 2 direct dd tasks just because the async
> > > > IOs are much more efficient than the bs=64k direct IOs.
> > > Likely because 64k is too low to get good bandwidth with direct IO. If
> > > it was 4M, I believe you would get similar throughput for buffered and
> > > direct IO. So essentially you are right, small IO benefits from caching
> > > effects since they allow you to submit larger requests to the device which
> > > is more efficient.
> >
> > I didn't direct compare the effects, however here is an example of
> > doing 1M, 64k, 4k direct writes in parallel. It _seems_ bs=1M only has
> > marginal benefits of 64k, assuming cfq is behaving well.
> >
> > https://github.com/fengguang/io-controller-tests/raw/master/log/snb/ext4/direct-write-1M-64k-4k.2012-04-19-10-50/balance_dirty_pages-task-bw.png
> >
> > The test case is:
> >
> >         # cgroup 1
> >         echo 500 > /cgroup/cp/blkio.weight
> >
> >         dd if=/dev/zero of=/fs/zero-1M bs=1M oflag=direct &
> >
> >         # cgroup 2
> >         echo 1000 > /cgroup/dd/blkio.weight
> >
> >         dd if=/dev/zero of=/fs/zero-64k bs=64k oflag=direct &
> >         dd if=/dev/zero of=/fs/zero-4k bs=4k oflag=direct &
> Um, I'm not completely sure what you tried to test in the above test.

Yeah it's not a good test case. I've changed it to run the 3 dd tasks
in 3 cgroups with equal weight. Attached the new results (looks the
same as the original one).

> What I wanted to point out is that direct IO is not necessarily less
> efficient than buffered IO. Look:
> xen-node0:~ # uname -a
> Linux xen-node0 3.3.0-rc4-xen+ #6 SMP PREEMPT Tue Apr 17 06:48:08 UTC 2012
> x86_64 x86_64 x86_64 GNU/Linux
> xen-node0:~ # dd if=/dev/zero of=/mnt/file bs=1M count=1024 conv=fsync
> 1024+0 records in
> 1024+0 records out
> 1073741824 bytes (1.1 GB) copied, 10.5304 s, 102 MB/s
> xen-node0:~ # dd if=/dev/zero of=/mnt/file bs=1M count=1024 oflag=direct conv=fsync
> 1024+0 records in
> 1024+0 records out
> 1073741824 bytes (1.1 GB) copied, 10.3678 s, 104 MB/s
>
> So both direct and buffered IO are about the same. Note that I used
> conv=fsync flag to erase the effect that part of buffered write still
> remains in the cache when dd is done writing which is unfair to direct
> writer...

OK, I also find direct write being a bit faster than buffered write:

root@snb /home/wfg# dd if=/dev/zero of=/mnt/file bs=1M count=1024 conv=fsync

1073741824 bytes (1.1 GB) copied, 10.4039 s, 103 MB/s
1073741824 bytes (1.1 GB) copied, 10.4143 s, 103 MB/s

root@snb /home/wfg# dd if=/dev/zero of=/mnt/file bs=1M count=1024 oflag=direct conv=fsync

1073741824 bytes (1.1 GB) copied, 9.9006 s, 108 MB/s
1073741824 bytes (1.1 GB) copied, 9.55173 s, 112 MB/s

root@snb /home/wfg# dd if=/dev/zero of=/mnt/file bs=64k count=16384 oflag=direct conv=fsync

1073741824 bytes (1.1 GB) copied, 9.83902 s, 109 MB/s
1073741824 bytes (1.1 GB) copied, 9.61725 s, 112 MB/s

> And actually 64k vs 1M makes a big difference on my machine:
> xen-node0:~ # dd if=/dev/zero of=/mnt/file bs=64k count=16384 oflag=direct conv=fsync
> 16384+0 records in
> 16384+0 records out
> 1073741824 bytes (1.1 GB) copied, 19.3176 s, 55.6 MB/s

Interestingly, my 64k direct writes are as fast as 1M direct writes...
and 4k writes run at ~1/4 speed:

root@snb /home/wfg# dd if=/dev/zero of=/mnt/file bs=4k count=$((256<<10)) oflag=direct conv=fsync

1073741824 bytes (1.1 GB) copied, 42.0726 s, 25.5 MB/s

Thanks,
Fengguang

Attachment:
balance_dirty_pages-task-bw.png
Description: PNG image
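
For anyone re-checking the "~1/4 speed" figure, the rates can be recomputed from dd's own summary lines. This is only a minimal sketch: it assumes the GNU dd summary format seen in the runs above ("N bytes (...) copied, T s, R MB/s") and decimal megabytes, and the dd_rate helper name is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: recompute MB/s from dd summary lines read on stdin.
# Assumes dd's decimal megabytes (1 MB = 1000000 bytes).
dd_rate() {
    awk '/bytes .* copied,/ {
        # $1 is the byte count; the field just before "s," is elapsed seconds
        for (i = 1; i <= NF; i++)
            if ($i == "s,") printf "%.1f\n", $1 / $(i - 1) / 1000000
    }'
}

# Summary lines copied from the runs above (first 64k direct run, 4k direct run).
r64k=$(echo "1073741824 bytes (1.1 GB) copied, 9.83902 s, 109 MB/s" | dd_rate)
r4k=$(echo "1073741824 bytes (1.1 GB) copied, 42.0726 s, 25.5 MB/s" | dd_rate)

# Ratio of 4k throughput to 64k throughput.
ratio=$(awk -v a="$r4k" -v b="$r64k" 'BEGIN { printf "%.2f", a / b }')
echo "64k: $r64k MB/s, 4k: $r4k MB/s, 4k/64k ratio: $ratio"
```

Feeding real dd output works the same way, e.g. `dd ... 2>&1 | dd_rate`, since dd prints its summary on stderr.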