On Sun, Oct 08, 2006 at 11:38:22PM +0200, Bodo Thiesen wrote:
> BTW: When I talked about a transaction I
> obviously meant something different than you, on the other hand that was my
> fault. What I meant with transaction is something like an atom. Moving a
> file from directory A to directory B needs (at least) four updates, the
> inodes of the directories and the directory data blocks. I would say, that
> this update is one transaction. But you would say, that is only a part of a
> transaction, as you would put deletion of another file, writing some data
> to an iso image and whatever else in the same transaction. So, just replace
> my "transactions" by "transaction atoms", and then read again, what I
> wrote, maybe that makes my idea more clearer.

Ah, but that brings up the other problem, which is that for a really
big file, your "transaction atom" might not fit in a single
"transaction". Remember, it's not just about keeping the inode,
indirect block, double indirect, and triple indirect blocks up to
date; it's also about all of those block allocation bitmaps, and for a
big file, the number of block bitmaps you might have to touch can grow
very large indeed.

If the number of blocks that have to be touched during the unlink is
larger than the space left for the journal, then we have to write a
consistent snapshot of the inode, indirect, double indirect, and
triple indirect blocks, plus all of the block bitmaps. And if you try
to "restore" the blocks afterwards, that's potentially an extra block
that needs to be journaled in the new transaction, and getting that
all right is more than a little bit tricky.

Now, the good news is that we are using bforget in journal_forget now,
and that at least some of the time, restoring the i_block[] pointers
will allow the inode to be recovered, although if the unlink operation
takes multiple transactions, you won't get the entire inode recovered
that way.
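To put rough numbers on the journal-space point, here is a
back-of-the-envelope sketch in Python. It is not taken from the ext3
code; the function names, the dictionary keys, and the journal budget
are made up for illustration. It assumes an indirect-mapped inode with
12 direct pointers, 4-byte block numbers, 32-byte group descriptors,
one bitmap block per block group, and a perfectly contiguous file,
which is the best case for the bitmap count. A fragmented file would
touch more bitmaps.

```python
def ceil_div(a, b):
    """Integer ceiling division for non-negative a and positive b."""
    return -(-a // b)


def unlink_metadata_blocks(file_bytes, block_size=4096, group_desc_size=32):
    """Estimate the metadata blocks involved in freeing one file.

    Illustrative assumptions (not read from any real filesystem):
    12 direct pointers, 4-byte block numbers, contiguous allocation.
    """
    ptrs = block_size // 4              # block pointers per indirect block
    blocks_per_group = 8 * block_size   # one bitmap block covers this many blocks
    data = ceil_div(file_bytes, block_size)

    # Blocks beyond the 12 direct pointers need the single indirect block.
    left = max(0, data - 12)
    single = 1 if left > 0 else 0

    # Double indirect: one top-level block plus one leaf per `ptrs` data blocks.
    left = max(0, left - ptrs)
    double = (1 + ceil_div(min(left, ptrs * ptrs), ptrs)) if left > 0 else 0

    # Triple indirect: top-level block, middle level, and leaves.
    left = max(0, left - ptrs * ptrs)
    leaves = ceil_div(left, ptrs)
    triple = (1 + ceil_div(leaves, ptrs) + leaves) if left > 0 else 0

    indirect = single + double + triple

    # Every freed block (data and indirect) clears a bit in some bitmap.
    bitmaps = ceil_div(data + indirect, blocks_per_group)

    # Each touched group also updates its descriptor (free block count).
    group_desc_blocks = ceil_div(bitmaps, block_size // group_desc_size)

    # Plus the inode table block and the superblock, once each.
    total = indirect + bitmaps + group_desc_blocks + 2
    return {
        "data": data,
        "indirect": indirect,
        "bitmaps": bitmaps,
        "group_desc_blocks": group_desc_blocks,
        "total": total,
    }


def fits_in_journal(estimate, journal_budget_blocks):
    """True if the estimate fits in a (hypothetical) per-transaction budget."""
    return estimate["total"] <= journal_budget_blocks
```

Under these assumptions a contiguous 1 GiB file with 4 KiB blocks
comes to 269 metadata blocks (257 indirect, 9 bitmaps, 1 group
descriptor block, the inode block, and the superblock). The count
grows with both file size and fragmentation, so a large enough file
will eventually exceed whatever budget the journal has left.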

The bottom line is that the interaction of truncate and journalling
gets tricky if you want it to be 100% reliable. If you're willing to
settle for "mostly working", it's probably not that hard.

						- Ted

_______________________________________________
Ext3-users mailing list
Ext3-users@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/ext3-users