17.08.2010 18:28, Christoph Hellwig wrote:
> On Tue, Aug 17, 2010 at 09:20:37AM -0500, Anthony Liguori wrote:
[]
>> For normal writes from a guest, we don't need to follow the write
>> with an fsync().  We should only need to issue an fsync() given an
>> explicit flush from the guest.
>
> Define normal writes.  For cache=none and cache=writeback we don't
> have to, and instead do explicit calls to fsync()/fdatasync()
> when we get a cache flush from the guest.  For cache=writethrough we
> guarantee data has made it to disk, and we implement this using
> O_DSYNC/O_SYNC when opening the file.  That tells the operating system
> to not return until data has hit the disk.  For Linux this is
> internally implemented using a range-fsync/fdatasync after the actual
> write.

And this is actually what I mentioned at the very beginning, in a
hopefully-single-thread email I sent.  I mentioned that ext4 is very
slow when used with O_SYNC (without O_DIRECT).  I still have had no
opportunity to collect more info on this, and yes, I've seen your
(Christoph's) speed tests of a few filesystems in the famous
"BTRFS: Unbelievably slow with kvm/qemu" thread.

A few users reported _insanely_ slow writes to qcow2 files with
the default cache mode on ext4.  And this is what prompted all this
discussion (which actually has nothing to do with the $subject
line ;), -- an attempt to think about replacing O_SYNC/fsync()
with something "lighter"...

>> fsync() being slow is orthogonal to my point.  I don't see why we
>> need to do an fsync() on *every* write.  It should only be necessary
>> when a guest injects an actual barrier.

We don't explicitly sync on every write, but O_SYNC implies exactly
that.  And apparently that is what happens behind the scenes in the
ext4 O_SYNC case.

But ok....

/mjt
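
P.S. To make the two strategies discussed above concrete, here is a
rough sketch in plain C.  It is not qemu code: the function names
write_sync_each() and write_then_flush() are made up for illustration,
and the "guest flush" is simply assumed to arrive once, after the last
write.  The first variant opens the file with O_DSYNC, so every write()
waits for the data to reach stable storage (roughly what
cache=writethrough asks for).  The second does plain buffered writes and
issues a single fdatasync() at the end (roughly the cache=writeback
behaviour, syncing only on an explicit flush).

```c
#define _POSIX_C_SOURCE 200809L

#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Write the whole buffer, retrying on short writes. Returns 0 or -1. */
static int write_all(int fd, const char *buf, size_t len)
{
    assert(buf != NULL);
    while (len > 0) {
        ssize_t n = write(fd, buf, len);
        if (n < 0)
            return -1;
        buf += n;
        len -= (size_t)n;
    }
    return 0;
}

/*
 * Shared helper (hypothetical): open `path` with optional extra open()
 * flags, write `count` copies of the buffer, and optionally call
 * fdatasync() once after the last write.
 */
static int write_blocks(const char *path, int extra_flags, int flush_at_end,
                        const char *buf, size_t len, int count)
{
    int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC | extra_flags, 0600);
    if (fd < 0)
        return -1;

    int rc = 0;
    for (int i = 0; i < count && rc == 0; i++)
        rc = write_all(fd, buf, len);

    /* Stand-in for an explicit flush request coming from the guest. */
    if (rc == 0 && flush_at_end)
        rc = fdatasync(fd);

    if (close(fd) < 0)
        rc = -1;
    return rc;
}

/* Writethrough-like: O_DSYNC makes each write() synchronous on its own. */
int write_sync_each(const char *path, const char *buf, size_t len, int count)
{
    return write_blocks(path, O_DSYNC, 0, buf, len, count);
}

/* Writeback-like: buffered writes, one fdatasync() on the "flush". */
int write_then_flush(const char *path, const char *buf, size_t len, int count)
{
    return write_blocks(path, 0, 1, buf, len, count);
}
```

Timing the two functions with the same block size and count on a given
filesystem would be one way to see how much the per-write sync costs
there; no numbers are claimed here.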