> On Mon, 21 Jan 2008 22:02:12 -0500 "Theodore Ts'o" <tytso@xxxxxxx> wrote: > From: Girish Shilamkar <girish@xxxxxxxxxxxxx> > > The journal checksum feature adds two new flags i.e > JBD2_FEATURE_INCOMPAT_ASYNC_COMMIT and JBD2_FEATURE_COMPAT_CHECKSUM. > > JBD2_FEATURE_CHECKSUM flag indicates that the commit block contains the > checksum for the blocks described by the descriptor blocks. > Due to checksums, writing of the commit record no longer needs to be > synchronous. Now commit record can be sent to disk without waiting for > descriptor blocks to be written to disk. This behavior is controlled > using JBD2_FEATURE_ASYNC_COMMIT flag. Older kernels/e2fsck should not be > able to recover the journal with _ASYNC_COMMIT hence it is made > incompat. > The commit header has been extended to hold the checksum along with the > type of the checksum. > > For recovery in pass scan checksums are verified to ensure the sanity > and completeness(in case of _ASYNC_COMMIT) of every transaction. > > ... > > +static inline __u32 jbd2_checksum_data(__u32 crc32_sum, struct buffer_head *bh) unneeded inlining. > +{ > + struct page *page = bh->b_page; > + char *addr; > + __u32 checksum; > + > + addr = kmap_atomic(page, KM_USER0); > + checksum = crc32_be(crc32_sum, > + (void *)(addr + offset_in_page(bh->b_data)), bh->b_size); > + kunmap_atomic(addr, KM_USER0); > + > + return checksum; > +} Can this buffer actually be in highmem? > static inline void write_tag_block(int tag_bytes, journal_block_tag_t *tag, > unsigned long long block) More unnecessary inlining. > +/* > + * jbd2_journal_clear_features () - Clear a given journal feature in the > + * superblock > + * @journal: Journal to act on. > + * @compat: bitmask of compatible features > + * @ro: bitmask of features that force read-only mount > + * @incompat: bitmask of incompatible features > + * > + * Clear a given journal feature as present on the > + * superblock. Returns true if the requested features could be reset. > + */ > +int jbd2_journal_clear_features(journal_t *journal, unsigned long compat, > + unsigned long ro, unsigned long incompat) > +{ > + journal_superblock_t *sb; > + > + jbd_debug(1, "Clear features 0x%lx/0x%lx/0x%lx\n", > + compat, ro, incompat); > + > + sb = journal->j_superblock; > + > + sb->s_feature_compat &= ~cpu_to_be32(compat); > + sb->s_feature_ro_compat &= ~cpu_to_be32(ro); > + sb->s_feature_incompat &= ~cpu_to_be32(incompat); > + > + return 1; > +} > +EXPORT_SYMBOL(jbd2_journal_clear_features); Kernel usually returns 0 on success. So we can return a useful errno on failure. > +/* > + * calc_chksums calculates the checksums for the blocks described in the > + * descriptor block. > + */ > +static int calc_chksums(journal_t *journal, struct buffer_head *bh, > + unsigned long *next_log_block, __u32 *crc32_sum) > +{ > + int i, num_blks, err; > + unsigned io_block; > + struct buffer_head *obh; > + > + num_blks = count_tags(journal, bh); > + /* Calculate checksum of the descriptor block. */ > + *crc32_sum = crc32_be(*crc32_sum, (void *)bh->b_data, bh->b_size); > + > + for (i = 0; i < num_blks; i++) { > + io_block = (*next_log_block)++; unsigned <- unsigned long. Are all the types appropriate in here? > + wrap(journal, *next_log_block); > + err = jread(&obh, journal, io_block); > + if (err) { > + printk(KERN_ERR "JBD: IO error %d recovering block " > + "%u in log\n", err, io_block); > + return 1; > + } else { > + *crc32_sum = crc32_be(*crc32_sum, (void *)obh->b_data, > + obh->b_size); > + } > + } > + return 0; > +} > + > static int do_one_pass(journal_t *journal, > struct recovery_info *info, enum passtype pass) > { > @@ -328,6 +360,7 @@ static int do_one_pass(journal_t *journal, > unsigned int sequence; > int blocktype; > int tag_bytes = journal_tag_bytes(journal); > + __u32 crc32_sum = ~0; /* Transactional Checksums */ > > /* Precompute the maximum metadata descriptors in a descriptor block */ > int MAX_BLOCKS_PER_DESC; > @@ -419,9 +452,23 @@ static int do_one_pass(journal_t *journal, > switch(blocktype) { > case JBD2_DESCRIPTOR_BLOCK: > /* If it is a valid descriptor block, replay it > - * in pass REPLAY; otherwise, just skip over the > - * blocks it describes. */ > + * in pass REPLAY; if journal_checksums enabled, then > + * calculate checksums in PASS_SCAN, otherwise, > + * just skip over the blocks it describes. */ > if (pass != PASS_REPLAY) { > + if (pass == PASS_SCAN && > + JBD2_HAS_COMPAT_FEATURE(journal, > + JBD2_FEATURE_COMPAT_CHECKSUM) && > + !info->end_transaction) { > + if (calc_chksums(journal, bh, > + &next_log_block, > + &crc32_sum)) { put_bh() > + brelse(bh); > + break; > + } > + brelse(bh); > + continue; put_bh() > + } > next_log_block += count_tags(journal, bh); > wrap(journal, next_log_block); > brelse(bh); > @@ -516,9 +563,96 @@ static int do_one_pass(journal_t *journal, > continue; > > + brelse(bh); etc - To unsubscribe from this list: send the line "unsubscribe linux-ext4" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html