On Thu 25-04-24 14:45:15, Ye Bin wrote: > From: Ye Bin <yebin10@xxxxxxxxxx> > > We encountered a problem that the file system could not be mounted in > the power-off scenario. The analysis of the file system mirror shows that > only part of the data is written to the last commit block. > The valid data of the commit block is concentrated in the first sector. > However, the data of the entire block is involved in the checksum calculation. > For different hardware, the minimum atomic unit may be different. > If the checksum of a committed block is incorrect, clear the data except the > 'commit_header' and then calculate the checksum. If the checkusm is correct, > it is considered that the block is partially committed. However, if there are > valid description/revoke blocks, it is considered that the data is abnormal > and the log replay is stopped. > > Signed-off-by: Ye Bin <yebin10@xxxxxxxxxx> The patch as a fix looks good. Feel free to add: Reviewed-by: Jan Kara <jack@xxxxxxx> I'd just note that the maze of branches and gotos in do_one_pass() is becoming very hard to follow so ideally we should come up with some refactoring of the function so that it is easier to follow. But that's definitely something for a separate patch. Honza > --- > fs/jbd2/recovery.c | 48 ++++++++++++++++++++++++++++++++++++++++++++++ > 1 file changed, 48 insertions(+) > > diff --git a/fs/jbd2/recovery.c b/fs/jbd2/recovery.c > index 1f7664984d6e..594bf02a709f 100644 > --- a/fs/jbd2/recovery.c > +++ b/fs/jbd2/recovery.c > @@ -443,6 +443,27 @@ static int jbd2_commit_block_csum_verify(journal_t *j, void *buf) > return provided == cpu_to_be32(calculated); > } > > +static bool jbd2_commit_block_csum_verify_partial(journal_t *j, void *buf) > +{ > + struct commit_header *h; > + __be32 provided; > + __u32 calculated; > + void *tmpbuf; > + > + tmpbuf = kzalloc(j->j_blocksize, GFP_KERNEL); > + if (!tmpbuf) > + return false; > + > + memcpy(tmpbuf, buf, sizeof(struct commit_header)); > + h = tmpbuf; > + provided = h->h_chksum[0]; > + h->h_chksum[0] = 0; > + calculated = jbd2_chksum(j, j->j_csum_seed, tmpbuf, j->j_blocksize); > + kfree(tmpbuf); > + > + return provided == cpu_to_be32(calculated); > +} > + > static int jbd2_block_tag_csum_verify(journal_t *j, journal_block_tag_t *tag, > journal_block_tag3_t *tag3, > void *buf, __u32 sequence) > @@ -479,6 +500,7 @@ static int do_one_pass(journal_t *journal, > int descr_csum_size = 0; > int block_error = 0; > bool need_check_commit_time = false; > + bool has_partial_commit = false; > __u64 last_trans_commit_time = 0, commit_time; > > /* > @@ -590,6 +612,14 @@ static int do_one_pass(journal_t *journal, > next_log_block); > } > > + if (pass == PASS_SCAN && has_partial_commit) { > + pr_err("JBD2: Detect validate descriptor block %lu after incomplete commit block\n", > + next_log_block); > + err = -EFSBADCRC; > + brelse(bh); > + goto failed; > + } > + > /* If it is a valid descriptor block, replay it > * in pass REPLAY; if journal_checksums enabled, then > * calculate checksums in PASS_SCAN, otherwise, > @@ -810,6 +840,14 @@ static int do_one_pass(journal_t *journal, > if (pass == PASS_SCAN && > !jbd2_commit_block_csum_verify(journal, > bh->b_data)) { > + if (jbd2_commit_block_csum_verify_partial( > + journal, > + bh->b_data)) { > + pr_notice("JBD2: Find incomplete commit block in transaction %u block %lu\n", > + next_commit_ID, next_log_block); > + has_partial_commit = true; > + goto chksum_ok; > + } > chksum_error: > if (commit_time < last_trans_commit_time) > goto ignore_crc_mismatch; > @@ -824,6 +862,7 @@ static int do_one_pass(journal_t *journal, > } > } > if (pass == PASS_SCAN) { > + chksum_ok: > last_trans_commit_time = commit_time; > head_block = next_log_block; > } > @@ -843,6 +882,15 @@ static int do_one_pass(journal_t *journal, > next_log_block); > need_check_commit_time = true; > } > + > + if (pass == PASS_SCAN && has_partial_commit) { > + pr_err("JBD2: Detect validate revoke block %lu after incomplete commit block\n", > + next_log_block); > + err = -EFSBADCRC; > + brelse(bh); > + goto failed; > + } > + > /* If we aren't in the REVOKE pass, then we can > * just skip over this block. */ > if (pass != PASS_REVOKE) { > -- > 2.31.1 > -- Jan Kara <jack@xxxxxxxx> SUSE Labs, CR