On Wed, 2009-08-26 at 18:27 +0200, Jan Kara wrote:
> > Our powerfail testing turned up an odd regression when using fsync() in
> > no-journal mode to force data to the device. We saw loss rates (both
> > file and data) that were much higher than the same test using ext2 (60+%
> > loss versus <10%). We've done some investigation and one thing that
> > stood out was that in the no-journal case, ext4_sync_file() was just
> > calling sync_inode() (and nothing else), while ext2_sync_file(), for
> > comparison, was also calling sync_mapping_buffers() to actually push the
> > data out.
> >
> > I therefore hacked ext4_sync_file() to call sync_mapping_buffers() in
> > the no-journal case; when we reran the test we saw that the loss rate
> > dropped from 60+% to around 50%. While it's clear that we have more
> > work to do in this area, this is a significant improvement. It appears
> > that this was just missed when we did the no-journal work. Do you guys
> > concur?
>
> Well, I'm surprised sync_mapping_buffers() did anything - I believe
> it's rather an error in testing. The thing is: sync_mapping_buffers()
> writes buffers on private_list of mapping. In ext2, it contains all the
> buffers used for indirect blocks. In ext4, there are no buffers there -
> you have to call mark_buffer_dirty_inode() to put a buffer to this list
> and ext4 does not do that with any buffer. So to make fsync work, you
> have to call mark_buffer_dirty_inode() in __ext4_handle_dirty_metadata
> if an inode is provided. Then sync_mapping_buffers() will actually do
> something.

Yeah, after digging further I realized that, but be that as it may, it
did indeed make a 10% improvement overall. Why? No idea. In any event
I'll keep digging as the basic problem is still there.

> BTW: the syncing code in ext4_handle_dirty_metadata() looks
> suboptimal. Why do you sync each and every metadata buffer? It might be
> the easiest way for directories but for regular files this is really
> superfluous.
> There you shouldn't need anything since VFS does the syncing
> for you.

Ah, you say "VFS" but what you really mean is "generic_file_xxx_write,"
correct? Basically, at the moment it's just doing in this case what ext2
does; it does sound like there's optimization that could be done here,
however.

> > The other interesting bit of this is that ext4 no-journal without using
> > fsync() has, apparently, basically the same loss rate as ext2 with
> > fsync().
>
> Isn't this the other way around? I suppose ext4 without fsync isn't
> better than ext4 with fsync ;).

That's what you would think, isn't it? However, you (and we) would be
wrong. In our testing, ext4+fsync was significantly worse than ext4
without fsync. Like, six times worse. Yes, this is a nonintuitive
result and no, I can't yet explain it.
--
Frank Mayhar <fmayhar@xxxxxxxxxx>
Google, Inc.
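
For anyone who wants to reproduce the user-space side of this, here is a
minimal sketch of the write-then-fsync() pattern under discussion. It is
not the actual powerfail test (that harness is not shown in this thread);
the helper name write_and_sync() and its arguments are hypothetical. It
writes a file, calls fsync() on it, and then calls fsync() on the
containing directory so that the new directory entry is also requested to
reach the device. Whether the data really survives a power cut depends on
what the filesystem's fsync implementation does underneath, which is the
point at issue above.

```c
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/*
 * Hypothetical helper (not from the thread): create dir/name, write
 * len bytes from buf, then fsync() the file and its parent directory.
 * Returns 0 on success, -1 on any failure.
 */
static int write_and_sync(const char *dir, const char *name,
                          const void *buf, size_t len)
{
    char path[4096];
    int n = snprintf(path, sizeof(path), "%s/%s", dir, name);
    if (n < 0 || (size_t)n >= sizeof(path))
        return -1;                      /* path would be truncated */

    int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (fd < 0)
        return -1;

    /* write() may be partial, so loop until everything is handed over. */
    const char *p = buf;
    size_t left = len;
    while (left > 0) {
        ssize_t w = write(fd, p, left);
        if (w < 0) {
            close(fd);
            return -1;
        }
        p += w;
        left -= (size_t)w;
    }

    /* Ask the kernel to push the file's data and metadata to the device. */
    if (fsync(fd) < 0) {
        close(fd);
        return -1;
    }
    if (close(fd) < 0)
        return -1;

    /* Also sync the directory so the new entry itself is flushed. */
    int dfd = open(dir, O_RDONLY);
    if (dfd < 0)
        return -1;
    int rc = fsync(dfd);
    close(dfd);
    return rc < 0 ? -1 : 0;
}
```

A caller would use it as write_and_sync("/mnt/test", "file0", data, len)
and treat a nonzero return as "not known to be durable". The mount point
and file name here are placeholders.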