On Mon, 2002-12-02 at 07:17, Andrew Morton wrote:

> Are you sure?  I can't make it happen on 2.4.19.  And disabling the new
> BH_Freed logic (which went into 2.4.20-pre5) makes it go away.
>
> --- linux-akpm/fs/jbd/commit.c~a Sun Dec 1 23:10:12 2002
> +++ linux-akpm-akpm/fs/jbd/commit.c Sun Dec 1 23:10:27 2002
> @@ -695,7 +695,7 @@ skip_commit: /* The journal should be un
> - clear_bit(BH_JBDDirty, &bh->b_state);
> +// clear_bit(BH_JBDDirty, &bh->b_state);

Argh.  That's not the right fix --- it reintroduces the bug that
BH_Freed was introduced to solve in the first place.

The problem is that ext3 is expecting that truncate_inode_pages() (and
hence ext3_flushpage) is only called during a truncate.  That's what the
function is named for, after all, and it's the hint we need to indicate
that future writeback on the data we're discarding should be disabled
(so that we don't get old data written on top of new data should the
block get deallocated.)

But kill_supers() eventually calls truncate_inode_pages() too when we're
doing the invalidate_inodes().  And ext3 is reacting just the way it
would for a normal truncate --- the data still gets written to the
journal (correct, if we reboot before the truncate commits then the old
data is preserved in the journal) but is not queued for writeback.

The solution is to set BH_Freed in ext3_flushpage IFF we're being called
from the truncate, but to avoid it if we're in an umount.  I'm not sure
of the best way to do that right now, but there are some trivial but
hacky methods possible (e.g. see if we're in a nested transaction; if so,
it's a truncate, if not, it's an umount.)

MS_ACTIVE might be a possible flag to test, but I'll need to
double-check whether that is 100% safe --- we can't afford to skip the
BH_Freed setting if we're in a truncate and the filesystem is not yet
completely quiesced.

Cheers,
 Stephen

_______________________________________________
Ext3-users@redhat.com
https://listman.redhat.com/mailman/listinfo/ext3-users
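
For illustration only, here is a small user-space model of the "nested transaction" heuristic mentioned above. It is not kernel code and not a proposed patch: every name in it (model_bh, model_handle, model_flushpage, model_commit and the MODEL_* flags) is a hypothetical stand-in, and the commit step is a simplified assumption about how BH_Freed leads to BH_JBDDirty being cleared.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical flag bits standing in for BH_JBDDirty and BH_Freed. */
enum {
	MODEL_BH_JBDDIRTY = 1 << 0,
	MODEL_BH_FREED    = 1 << 1
};

/* Hypothetical stand-in for a buffer_head: only the state bits matter. */
struct model_bh {
	unsigned int state;
};

/* Hypothetical stand-in for a journal handle; nesting > 0 means the
 * caller already holds a running transaction. */
struct model_handle {
	int nesting;
};

/* The heuristic from the mail: a truncate calls in with a transaction
 * already open, an umount does not. */
static bool called_from_truncate(const struct model_handle *h)
{
	return h != NULL && h->nesting > 0;
}

/* Model of the flushpage decision: mark the buffer freed only when we
 * believe a truncate is discarding the data. */
static void model_flushpage(struct model_bh *bh, const struct model_handle *h)
{
	assert(bh != NULL);
	if (called_from_truncate(h))
		bh->state |= MODEL_BH_FREED;
}

/* Simplified model of commit: a freed buffer loses its dirty bit, so
 * it is never queued for writeback after the commit. */
static void model_commit(struct model_bh *bh)
{
	assert(bh != NULL);
	if (bh->state & MODEL_BH_FREED)
		bh->state &= ~(unsigned int)(MODEL_BH_JBDDIRTY | MODEL_BH_FREED);
}

/* True if the buffer would still be written back to its home location. */
static bool model_needs_writeback(const struct model_bh *bh)
{
	return (bh->state & MODEL_BH_JBDDIRTY) != 0;
}
```

In this model, a dirty buffer flushed with an open handle loses its dirty bit at commit (the truncate case), while the same buffer flushed with no handle stays dirty and is still written back (the umount case). Whether handle nesting is a safe discriminator in the real code is exactly the open question above.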