On Wed, Oct 08, 2008 at 11:06:27AM -0500, Anthony Liguori wrote:
> Daniel P. Berrange wrote:
> >On Wed, Oct 08, 2008 at 01:15:46PM +0200, Chris Lalancette wrote:
> >>Daniel P. Berrange wrote:
> >>>QEMU defaults to allowing the host OS to cache all disk I/O. This has a
> >>>couple of problems:
> >>>
> >>> - It is a waste of memory because the guest already caches I/O ops
> >>> - It is unsafe on host OS crash - all unflushed guest I/O will be
> >>>   lost, and there are no ordering guarantees, so metadata updates could
> >>>   be flushed to disk while the journal updates were not. Say goodbye
> >>>   to your filesystem.
> >>> - It makes benchmarking more or less impossible / worthless because
> >>>   what the benchmark thinks are disk writes just sit around in memory,
> >>>   so guest disk performance appears to exceed host disk performance.
> >>>
> >>>This patch disables caching on all QEMU guests. NB, Xen has long done this
> >>>for both PV & HVM guests - QEMU only gained this ability when -drive was
> >>>introduced, and sadly kept the default to unsafe cache=on settings.
> >>
> >>I'm for this in general, but I'm a little worried about the "performance
> >>regression" aspect of this. People are going to upgrade to 0.4.7 (or
> >>whatever), and suddenly find that their KVM guests perform much more
> >>slowly. This is better in the end for their data, but we might hear
> >>large complaints about it.
> >
> >Yes & no. They will find their guests perform more consistently. With the
> >current system their guests will perform very erratically depending on
> >memory & I/O pressure on the host. If the host I/O cache is empty & has
> >no I/O load, current guests will be "fast",
>
> They will perform marginally better than if cache=off. This is because the
> Linux host knows more about the underlying hardware than the guest and
> is able to do smarter read-ahead. When using cache=off, the host cannot
> perform any sort of read-ahead.
>
> >but if the host I/O cache is full
> >and they do something which requires more host memory (e.g. start up
> >another guest), then all existing guests get their I/O performance trashed
> >as the I/O cache has to be flushed out, and future I/O is unable to be
> >cached.
>
> This is not accurate. Dirty pages in the host page cache are not
> reclaimable until they're written to disk. If you're in a seriously low
> memory situation, then the thing allocating memory is going to sleep
> until the data is written to disk. If an existing guest is trying to do
> I/O, then what things will degenerate to is basically cache=off, since
> the guest must wait for other pending I/O to complete.
>
> >Xen went through this same change and there were not any serious
> >complaints, particularly when explained that the previous system had
> >zero data integrity guarantees. The current system merely provides an
> >illusion of performance - any attempt to show that performance has
> >decreased is impossible because any attempt to run benchmarks with
> >existing caching just results in meaningless garbage.
> >
> >https://bugzilla.redhat.com/show_bug.cgi?id=444047
>
> I can't see this bug, but a quick grep of ioemu in xen-unstable for
> O_DIRECT reveals that they are not in fact using O_DIRECT.

Sorry, it was mistakenly private - fixed now.

Xen does use O_DIRECT for the paravirt driver case - blktap is using the
combo of AIO+O_DIRECT. The QEMU code is only used for the IDE emulation
case, which isn't interesting from a performance POV.
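For reference, here is a minimal sketch of what the AIO+O_DIRECT combination
looks like from userspace. This is not blktap's actual code - the file name,
buffer sizes and use of libaio are purely illustrative:

/* Minimal AIO+O_DIRECT sketch (illustrative only, not blktap code).
 * Build with: gcc -o aio-direct aio-direct.c -laio */
#define _GNU_SOURCE
#include <fcntl.h>
#include <libaio.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

int main(void)
{
    /* O_DIRECT bypasses the host page cache entirely */
    int fd = open("guest-disk.img", O_RDONLY | O_DIRECT);
    if (fd < 0) { perror("open"); return 1; }

    /* O_DIRECT requires sector-aligned buffers, offsets and lengths */
    void *buf;
    if (posix_memalign(&buf, 512, 4096)) { close(fd); return 1; }

    io_context_t ctx = 0;
    if (io_setup(1, &ctx) < 0) { perror("io_setup"); close(fd); return 1; }

    struct iocb cb;
    struct iocb *cbs[1] = { &cb };
    io_prep_pread(&cb, fd, buf, 4096, 0);   /* read 4k from offset 0 */

    /* queue the request without blocking the caller ... */
    if (io_submit(ctx, 1, cbs) != 1) { perror("io_submit"); return 1; }

    /* ... and reap the completion later */
    struct io_event ev;
    io_getevents(ctx, 1, 1, &ev, NULL);
    printf("read returned %ld bytes\n", (long)ev.res);

    io_destroy(ctx);
    free(buf);
    close(fd);
    return 0;
}

The point being that since O_DIRECT forces aligned I/O straight to the
device, the host can't do any read-ahead or write-back caching on the
guest's behalf - hence the consistency & data integrity arguments above.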
Daniel
-- 
|: Red Hat, Engineering, London  -o-  http://people.redhat.com/berrange/ :|
|: http://libvirt.org  -o-  http://virt-manager.org  -o-  http://ovirt.org :|
|: http://autobuild.org  -o-  http://search.cpan.org/~danberr/ :|
|: GnuPG: 7D3B9505  -o-  F3C9 553F A1DA 4AC2 5648 23C1 B3DF F742 7D3B 9505 :|