On Friday 18 May 2007 15:03:46 Andreas Dilger wrote:
> On May 18, 2007 09:48 -0400, Mats Ahlgren wrote:
> > Namely, I'm confused: I would guess caching simply delays the time data gets
> > to disk, and perhaps exacerbates data being written in not-the-order it was
> > given? But, how could this cause a problem on a journaled filesystem? If one
> > is (theoretically) only appending to the journal, checksumming/hashing to
> > detect consistent journal entries on failure (since the last checkpoint), and
> > only replaying consistent journal entries (which are idempotent)... then,
> > assuming all those things above work, how could caching cause massive
> > corruption of the directory tree? (Is the above an accurate model for ext3?)
>
> One issue is that we do not YET have journal checksumming in order to detect
> the case where the commit block is written to the disk but some of the
> disk-cached blocks in the rest of that transaction are not yet committed.
> That is where the big risk comes in for writeback cache in the device.

Yikes... (that was my best guess for what was going on)

> Ideally, the jbd layer could be notified when the transaction blocks are
> flushed from device cache before writing the commit block, but the current
> linux mechanism to do this (write barriers) sucks performance-wise (it
> sent throughput from 180MB/s to 7MB/s when enabled in our test systems).
> It was better to just turn off write cache entirely than to use barriers.
>
> We have a patch for journal checksumming that is _right_ at the verge of
> being ready for fixing the "commit-block before transaction blocks" problem.
> In fact, in earlier testing it improved performance in some cases because
> it allows the commit block to always be sent to disk at the same time as the
> transaction blocks because we know the checksum will tell us if there were
> any blocks not written to disk.

Good to hear!
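
The "commit-block before transaction blocks" problem quoted above can be
illustrated with a toy model. This is only a sketch in Python, not how jbd
or the pending patch is implemented: the commit record layout, the function
names make_commit and is_replayable, and the choice of CRC32 are all
assumptions made for illustration.

```python
import zlib


def make_commit(blocks):
    """Build a hypothetical commit record with a CRC32 over all blocks."""
    crc = 0
    for block in blocks:
        crc = zlib.crc32(block, crc)  # running CRC across the transaction
    return {"nblocks": len(blocks), "crc": crc}


def is_replayable(on_disk_blocks, commit, use_checksum):
    """Decide whether recovery should replay this transaction."""
    if commit is None:
        return False  # no commit block on disk: transaction is ignored
    if not use_checksum:
        return True  # the commit block alone is trusted
    crc = 0
    for block in on_disk_blocks:
        crc = zlib.crc32(block, crc)
    return crc == commit["crc"]


# A transaction of three journal blocks as the filesystem intended them.
intended = [b"inode table update", b"directory block", b"block bitmap"]
commit = make_commit(intended)

# Crash scenario: the drive's write cache persisted the commit block but
# lost the second block, so stale bytes remain in that journal slot.
on_disk = [intended[0], b"stale garbage", intended[2]]

replay_without_checksum = is_replayable(on_disk, commit, use_checksum=False)
replay_with_checksum = is_replayable(on_disk, commit, use_checksum=True)
replay_intact = is_replayable(intended, commit, use_checksum=True)
```

In this model, recovery without a checksum would replay the stale block into
the filesystem, while recovery with a checksum rejects the incomplete
transaction and still accepts one whose blocks all reached the disk.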
It's a pity ext3 didn't have journal checksumming from its inception, but I'm
glad you guys are fixing it. This seems like a serious problem for people who
aren't aware of it.

Sincerely,
Mats

> Girish, could you post your latest tested patch here for review?
[snip]
> > On Sunday 18 March 2007 09:33:59 Theodore Tso wrote:
> > > It sounds like you have a disk which is doing very aggressive write
> > > caching. If you are using a new enough kernel (2.6.9 or greater
> > > should have this), adding "barrier=1" to your mount options should
> > > help. We should probably make this the default at this point...
> > >
> > > - Ted
>
> Cheers, Andreas
> --
> Andreas Dilger
> Principal Software Engineer
> Cluster File Systems, Inc.

_______________________________________________
Ext3-users mailing list
Ext3-users@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/ext3-users