On Fri, May 15, 2009 at 10:25:08AM +0530, Aneesh Kumar K.V wrote: > > commit 039ed7a483fdcb2dbbc29f00cd0d74c101ab14c5 > > Author: Theodore Ts'o <tytso@xxxxxxx> > > Date: Thu May 14 17:09:37 2009 -0400 > > > > ext4: Fix race in ext4_inode_info.i_cached_extent > > > > If one CPU is reading from a file while another CPU is writing to the > > same file different locations, there is nothing protecting the > > i_cached_extent structure from being used and updated at the same > > time. This could potentially cause the wrong location on disk to be > > read or written to, including potentially causing the corruption of > > the block group descriptors and/or inode table. > > > It should be multiple readers. We don't allow read/write or multiple > writers via ext4_ext_get_blocks. &EXT4_I(inode)->i_data_sem is supposed > to protect read/write and multiple writers. What it allowed was > multiple readers(get_block call with create = 0). And readers did cache > the extent information which it read from the disk. So the fix is > correct, but we need to update the commit message. Yes, you're right. The i_data_sem would have protected us from the worst of these problems. The commit message isn't totally wrong, though, since even a writer will call ext4_ext_get_blocks() with create == 0. But we should describe the failure mode much more accurately, and say that it is two callers of ext4_ext_get_blocks() with create=0 racing against one another. It's possible, perhaps even likely, that one of those callers came from a call of ext4_get_blocks() with create=1 --- given that the people who were reporting this were doing so when doing backup jobs. However, one could imagine multiple mysql (or DB2) threads writing into random parts of a database, and if two CPU's try to bmap already allocated (and initialized) blocks out of the database file, so that ext4_ext_get_blocks() is always being called with create=0, you could very easily end up with a situation where blocks would be written to the wrong location on disk. <<shiver>> We should make sure that all of the enterprise distributions that are shipping ext4 as a "technology preview" get a copy of this patch as a fixing a high-priority bug, since a database workload should trip this pretty easily. - Ted -- To unsubscribe from this list: send the line "unsubscribe linux-ext4" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html