On Aug 21, 2003  15:28 -0400, Erez Zadok wrote:
> In message <20030821190811.GC1040@xxxxxxxxxxxxx>, Mike Fedyk writes:
> > There's no need to support it in the kernel.  The inode number is kept in
> > the superblock, and that's updated at mkfs and tune2fs time, not from the
> > kernel.
> >
> > Also, there isn't a second inode, it's just that the inode number is being
> > kept in the superblocks too.
>
> How does the kernel know to write the journal data first to some data block
> belonging to inode X, and then to another data block of inode Y?  Both X and
> Y are journal inodes, right?  Will there be a reserved inum other than 8,
> for the backup journal?
>
> Is there some magic in which the kernel can identify any number of special
> journal inodes?
>
> And while we're at it, why only one backup journal inode?  Why not several?
> If it's good enough to have several copies of superblocks etc., then why not
> the journal (for those willing to pay the performance penalty)?

There are not, AFAICS, two copies of the journal being kept, which would
require kernel changes and cause an even larger performance hit for ext3.

Instead, the journal inode number is being kept in all of the backup
superblocks (I don't think it was in the past).  Secondly, there is a new
"backup journal inode" (also kept in the superblock + backups), which I
infer holds a duplicate of the blocks allocated to the journal.

Having only the inode's i_block array (its block pointers) duplicated in a
backup inode means that there is no (new) overhead writing to the journal,
yet if the journal inode itself gets corrupted (very possible because it
shares the same disk block with the root inode and is right at the beginning
of the disk), we have a chance to recover the journal data.  As a result,
the journal itself will very likely have backups of recently-written blocks
and can "self heal" from all sorts of nasty corruptions.

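
To make the idea above concrete, here is a toy Python model of it.  This is
only a sketch under my own assumptions, not the e2fsprogs code or the real
on-disk layout: the names Inode, Superblock, save_journal_backup,
looks_valid and locate_journal are all hypothetical, and the validity check
is deliberately crude.  It shows a one-time copy of the journal inode's
block pointers into the superblock, and a lookup that falls back to that
copy when the journal inode is damaged.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple

# ext2/ext3 inodes carry 15 block pointers (12 direct + 3 indirect).
N_BLOCK_POINTERS = 15


@dataclass
class Inode:
    # Toy stand-in for the inode's block map; real inodes hold much more.
    block_pointers: List[int]


@dataclass
class Superblock:
    journal_inum: int = 8  # inode 8 is the journal inode mentioned above
    # Hypothetical field: saved copy of the journal inode's block pointers.
    journal_backup: List[int] = field(default_factory=list)


def save_journal_backup(sb: Superblock, inode_table: Dict[int, Inode]) -> None:
    """Done once, at mkfs/tune2fs time in this model, never per journal write."""
    sb.journal_backup = list(inode_table[sb.journal_inum].block_pointers)


def looks_valid(inode: Optional[Inode]) -> bool:
    """Crude, hypothetical sanity check on a journal inode."""
    return (
        inode is not None
        and len(inode.block_pointers) == N_BLOCK_POINTERS
        and any(p != 0 for p in inode.block_pointers)
    )


def locate_journal(
    sb: Superblock, inode_table: Dict[int, Inode]
) -> Tuple[List[int], str]:
    """Return the journal's block pointers and where they came from."""
    inode = inode_table.get(sb.journal_inum)
    if looks_valid(inode):
        return list(inode.block_pointers), "inode"
    if sb.journal_backup:
        # Journal inode is damaged: use the copy kept in the superblock.
        return list(sb.journal_backup), "backup"
    raise LookupError("journal block map unavailable")
```

Because the copy is taken only when the journal is created or changed, the
normal journal write path is untouched, which is the "no (new) overhead"
point made above.
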
What would also be needed (not sure if this is implemented or not) is that,
in the case of a corrupt superblock, e2fsck assumes "needs_recovery" is set
if "has_journal" is set and the (backup) journal inode can be read, so that
the journal replay is actually done.  That will almost always result in the
primary superblock being restored from somewhere in the journal, along with
other useful things like bitmaps and such.

Cheers, Andreas
--
Andreas Dilger
http://sourceforge.net/projects/ext2resize/
http://www-mddsp.enel.ucalgary.ca/People/adilger/

_______________________________________________

Ext3-users@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/ext3-users