Nathan Grennan wrote:
Why is the command below all that is needed to bring the system to
its knees? Why doesn't the I/O scheduler, CFQ, which is supposed to be
all about fairness, keep it from starving other processes? For example,
if I open a new file in vim and hold down "i" while this is running,
the display of new "i"s pauses for seconds, sometimes until the dd
write is completely finished. Another example: applications like
firefox, thunderbird, xchat, and pidgin stop refreshing for 10+ seconds.
dd if=/dev/zero of=test-file bs=2M count=2048
I understand the main difference between using oflag=direct or not
relates to whether the I/O scheduler is used and whether the file is
cached. I can see this clearly by watching the cached figure rise
without oflag=direct, stay the same with it, and drop sharply when I
delete the file after running dd without oflag=direct.
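(For reference, the way I watch the cache is just to poll /proc/meminfo;
field names might differ slightly between kernels, but something like this
shows the cached and dirty figures once a second:)
watch -n1 'grep -E "^(Cached|Dirty|Writeback):" /proc/meminfo'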
The system in question is running Fedora 8. It is an E6600 with 4 GB of
memory and two 300 GB Seagate SATA drives. The drives are set up with md
RAID 1, and the filesystem is ext3. But I also see this on plenty of
other systems with more CPU, less CPU, less memory, RAID, and no RAID.
I have tried various tweaks to the vm.* sysctl settings and tried
changing the scheduler to as or deadline. Nothing seems to get it to
behave, other than oflag=direct.
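(By vm settings I mean the knobs under /proc/sys/vm; the dirty-memory
thresholds looked like the most relevant ones. For example, and these
values are just an illustration, not necessarily what I ran:)
sysctl -w vm.dirty_background_ratio=5
sysctl -w vm.dirty_ratio=10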
Using dd if=/dev/zero is just an easy test case. I see the same thing
when copying large files, creating large files, and using virtualization
software that does heavy I/O on large files.
The command below results in CPU idle around 0 and I/O wait around 100%,
as shown by "vmstat 1".
dd if=/dev/zero of=test-file bs=2M count=2048
2048+0 records in
2048+0 records out
4294967296 bytes (4.3 GB) copied, 94.7903 s, 45.3 MB/s
The command below works much better for responsiveness: CPU idle sits
around 50 and I/O wait around 50.
dd if=/dev/zero of=test-file2 bs=2M count=2048 oflag=direct
2048+0 records in
2048+0 records out
4294967296 bytes (4.3 GB) copied, 115.733 s, 37.1 MB/s
CFQ is optimized for throughput, not latency. When you're doing dd without
oflag=direct, you're dirtying memory faster than it can be written to disk, so
pdflush will spawn up to 8 threads (giving it 8 threads' worth of CFQ time),
which starves out vim's extremely frequent syncing of its swap file.
The 8-threaded behavior of pdflush is a bit of a hack, and upstream is working
on pageout improvements that should obviate it, but that work is still experimental.
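You can actually watch the pdflush threads pile up while the dd is running;
something along these lines should show them (and on kernels that expose it,
/proc/sys/vm/nr_pdflush_threads reports the current count):
ps ax | grep '[p]dflush'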
vim's behavior is a performance/robustness tradeoff, and is expected to be slow
when the system is doing a lot of I/O.
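If the vim stalls in particular bother you, you can dial back that
robustness. If I remember right, vim has 'swapsync' and 'fsync' options
controlling how aggressively it syncs; check :help before relying on this,
but something like the following makes it sync far less often:
:set swapsync=
:set nofsync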
As for your virtualization workloads, this is why most virtualization
software (including Xen and KVM) lets you put the guest on a block device,
such as a logical volume, and do direct I/O to it, which takes pdflush out
of the picture.
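With KVM/qemu that looks roughly like the line below. The LV path here is
made up, the binary may be called kvm or qemu-kvm depending on the
packaging, and older versions spell the cache option cache=off rather than
cache=none:
qemu-kvm -m 512 -drive file=/dev/VolGroup00/guest_lv,cache=none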
Ultimately, if latency is a high priority for you, you should switch to the
deadline scheduler.
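You can switch it per device at runtime, or for the whole machine at boot;
substitute your actual drives (both members of the md mirror) for sda here:
echo deadline > /sys/block/sda/queue/scheduler
cat /sys/block/sda/queue/scheduler
or add elevator=deadline to the kernel command line in grub.conf.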
-- Chris