On Fri 25-01-19 20:40:23, zhangyi (F) wrote: > The jbd2 superblock is lockless now, so there is probably a race > condition between writing it so disk and modifing contents of it, which > may lead to checksum error. The following race is the one case that we > have captured. > > jbd2 fsstress > jbd2_journal_commit_transaction > jbd2_journal_update_sb_log_tail > jbd2_write_superblock > jbd2_superblock_csum_set jbd2_journal_revoke > jbd2_journal_set_features(revork) > modify superblock > submit_bh(checksum incorrect) > > One alternative fix is to lock the buffer everywhere that modifing it, > but it may expensive. So this patch introduce an in-memory jbd2 > superblock, update it to the on-disk superblock and calculate checksum > at write time. > > This checksum corruption problem can be reproduced by xfstests > generic/475. > > Signed-off-by: zhangyi (F) <yi.zhang@xxxxxxxxxx> Thanks for the analysis and the patch! I think that copying of the superblock to a temporary buffer is not free either and the frequent updates of journal superblock are synchronized by j_checkpoint_mutex anyway. So I think that using buffer lock when modifying journal superblock contents is actually the easiest way forward. Honza > --- > fs/jbd2/journal.c | 17 ++++++++++++++--- > 1 file changed, 14 insertions(+), 3 deletions(-) > > diff --git a/fs/jbd2/journal.c b/fs/jbd2/journal.c > index 8ef6b6d..ad59909 100644 > --- a/fs/jbd2/journal.c > +++ b/fs/jbd2/journal.c > @@ -1133,6 +1133,7 @@ static journal_t *journal_init_common(struct block_device *bdev, > static struct lock_class_key jbd2_trans_commit_key; > journal_t *journal; > int err; > + journal_superblock_t *sb = NULL; > struct buffer_head *bh; > int n; > > @@ -1182,6 +1183,10 @@ static journal_t *journal_init_common(struct block_device *bdev, > if (!journal->j_wbuf) > goto err_cleanup; > > + sb = kzalloc(sizeof(*sb), GFP_KERNEL); > + if (!sb) > + goto err_cleanup; > + > bh = getblk_unmovable(journal->j_dev, start, journal->j_blocksize); > if (!bh) { > pr_err("%s: Cannot get buffer for journal superblock\n", > @@ -1189,11 +1194,12 @@ static journal_t *journal_init_common(struct block_device *bdev, > goto err_cleanup; > } > journal->j_sb_buffer = bh; > - journal->j_superblock = (journal_superblock_t *)bh->b_data; > + journal->j_superblock = sb; > > return journal; > > err_cleanup: > + kfree(sb); > kfree(journal->j_wbuf); > jbd2_journal_destroy_revoke(journal); > kfree(journal); > @@ -1360,6 +1366,7 @@ static int jbd2_write_superblock(journal_t *journal, int write_flags) > { > struct buffer_head *bh = journal->j_sb_buffer; > journal_superblock_t *sb = journal->j_superblock; > + journal_superblock_t *raw_sb = (journal_superblock_t *)bh->b_data; > int ret; > > trace_jbd2_write_superblock(journal, write_flags); > @@ -1381,7 +1388,8 @@ static int jbd2_write_superblock(journal_t *journal, int write_flags) > clear_buffer_write_io_error(bh); > set_buffer_uptodate(bh); > } > - jbd2_superblock_csum_set(journal, sb); > + memcpy(raw_sb, sb, sizeof(journal_superblock_t)); > + jbd2_superblock_csum_set(journal, raw_sb); > get_bh(bh); > bh->b_end_io = end_buffer_write_sync; > ret = submit_bh(REQ_OP_WRITE, write_flags, bh); > @@ -1507,7 +1515,7 @@ EXPORT_SYMBOL(jbd2_journal_update_sb_errno); > static int journal_get_superblock(journal_t *journal) > { > struct buffer_head *bh; > - journal_superblock_t *sb; > + journal_superblock_t *sb, *raw_sb; > int err = -EIO; > > bh = journal->j_sb_buffer; > @@ -1526,7 +1534,9 @@ static int journal_get_superblock(journal_t *journal) > if (buffer_verified(bh)) > return 0; > > + raw_sb = (journal_superblock_t *)bh->b_data; > sb = journal->j_superblock; > + memcpy(sb, raw_sb, sizeof(journal_superblock_t)); > > err = -EINVAL; > > @@ -1777,6 +1787,7 @@ int jbd2_journal_destroy(journal_t *journal) > jbd2_journal_destroy_revoke(journal); > if (journal->j_chksum_driver) > crypto_free_shash(journal->j_chksum_driver); > + kfree(journal->j_superblock); > kfree(journal->j_wbuf); > kfree(journal); > > -- > 2.7.4 > -- Jan Kara <jack@xxxxxxxx> SUSE Labs, CR