Hi Ted,
On 12/30/21 02:37, Theodore Ts'o wrote:
On Tue, Dec 28, 2021 at 09:36:22PM +0100, Manfred Spraul wrote:
Hi,
with simulated power failures, I see a corrupted journal
[39056.200845] JBD2: journal transaction 6943 on loop0-8 is corrupt.
[39056.200851] EXT4-fs (loop0): error loading journal
This means that the journal replay found a commit which was *not* the
last commit, and which contained a CRC error. If it's the last commit
(e.g., there is no valid subsequent commit block), then it's possible
that the journal commit was never completed before the system crashed
--- e.g., it was an interrupted commit.
It is the last commit, there are no valid subsequent commit blocks.
The current failure model is simple: all blocks up to block <n> are
written, the blocks starting from <n+1> are discarded.
What I can't rule out, but I think this is not what I see:
The image is small (512 MB) and everything is in memory. Thus I would
not rule out that the whole journal is filled within less than one second.
The commit header contains h_commit_sec and h_commit_nsec, but from what
I see, do_one_pass() evaluates only h_commit_sec.
Your test is aborting the commit at various points in the write I/O
stream, so it should be simulating an interrupted commit (assuming
that it's not corrupting any I/O. So the jbd2 layer should have
understood it was the last commit in the journal, and been OK with the
checksum failure.
I think the jbd2 layer understood that it was the last commit - but it
nevertheless failed the recovery.
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/fs/jbd2/recovery.c?h=v5.16-rc7#n809
|
if (pass == PASS_SCAN &&
!jbd2_commit_block_csum_verify(journal,
bh->b_data)) {
chksum_error:
if (commit_time < last_trans_commit_time)
goto ignore_crc_mismatch;
info->end_transaction = next_commit_ID;
if (!jbd2_has_feature_async_commit(journal)) {
journal->j_failed_commit =
next_commit_ID;
brelse(bh);
break;
}
}
|
async_commit() is false.
journal->j_failed_commit is set, and thus after journal playback, a
failure is reported :-(
What I have done:
- I have removed the line journal->j_failed_commit = next_commit_ID,
then the image is mounted.
- I had added pr_info() lines, and this confirms that it starts with a
failure of jbd2_block_csum_verify().
dmesg with JBD2 debug output and additional pr_info lines:
>>>
[ 748.591348] next_commit_id increased: 6943.
[ 748.593397] fs/jbd2/recovery.c: (do_one_pass, 517): Scanning for
sequence ID 6943 at 4544/8192
[ 748.594322] fs/jbd2/recovery.c: (do_one_pass, 526): JBD2: checking
block 4544
[ 748.595879] fs/jbd2/recovery.c: (do_one_pass, 549): Found magic 5,
sequence 6943
[ 748.596800] fs/jbd2/recovery.c: (do_one_pass, 517): Scanning for
sequence ID 6943 at 4545/8192
[ 748.600073] fs/jbd2/recovery.c: (do_one_pass, 526): JBD2: checking
block 4545
[ 748.601398] fs/jbd2/recovery.c: (do_one_pass, 549): Found magic 1,
sequence 6943
[ 748.603481] fs/jbd2/recovery.c: (do_one_pass, 517): Scanning for
sequence ID 6943 at 4574/8192
[ 748.606238] fs/jbd2/recovery.c: (do_one_pass, 526): JBD2: checking
block 4574
[ 748.607241] fs/jbd2/recovery.c: (do_one_pass, 549): Found magic 2,
sequence 6943
[ 748.608248] commit_csum_verify error: provided 81be34bd.
[ 748.610199] direct block verify error.
[ 748.612327] chksum_error.
[ 748.617616] error line 2.
[ 748.622085] error line 3.
[ 748.623291] fs/jbd2/recovery.c: (do_one_pass, 517): Scanning for
sequence ID 6943 at 4575/8192
[ 748.624133] fs/jbd2/recovery.c: (do_one_pass, 526): JBD2: checking
block 4575
[ 748.625134] done, info->end_transaction is 6943.
[ 748.626966] done with update, info->end_transaction is 6943.
[ 748.629521] next_commit_id: initial value 6799.
[ 748.630836] fs/jbd2/recovery.c: (do_one_pass, 491): Starting recovery
pass 1
[ 748.631529] fs/jbd2/recovery.c: (do_one_pass, 517): Scanning for
sequence ID 6799 at 6734/8192
[ 748.633346] fs/jbd2/recovery.c: (do_one_pass, 526): JBD2: checking
block 6734
<<<
What do you think?
Is JBD2 too aggressive in declaring something a journal corruption?
What is definitively correct is that the failure is minor. fsck -f was
able to correct it.
Thus: What is your understanding:
If a mount command fails due to a journal corruption, should user space
first try a fsck -f before giving up?
--
Manfred
From 759062a21176c46a8fc1fa4d95e20c4450a10b9a Mon Sep 17 00:00:00 2001
From: Manfred Spraul <manfred@xxxxxxxxxxxxxxxx>
Date: Thu, 30 Dec 2021 08:22:00 +0100
Subject: [PATCH] jbd2/recovery.c: Continue on csum failures for commit record
Noticed with simulated power failures, i.e. not on real hardware:
The failure model is:
- 512 bytes writes are atomic.
- Larger writes are not atomic.
- Everything is written in order.
Since the JBD2 block size can be larger than the block size of the
physical drive, it may happen that a (JBD2) block starts with the
expected magic/block type==JBD2_COMMIT_BLOCK/sequence number/
commit time, but nevertheless the csum verification fails because
jbd2_commit_block_csum_verify() calculates a checksum over
the complete JBD2 block.
Thus: Just end the scan on a csum failure.
Note: The change is most likely incomplete. There are probably
more situations where the code assumes that an incorrect csum
is always a corruption.
---
fs/jbd2/recovery.c | 21 ++++++++++++++++++---
1 file changed, 18 insertions(+), 3 deletions(-)
diff --git a/fs/jbd2/recovery.c b/fs/jbd2/recovery.c
index 8ca3527189f8..f6d59bc204a1 100644
--- a/fs/jbd2/recovery.c
+++ b/fs/jbd2/recovery.c
@@ -709,6 +709,20 @@ static int do_one_pass(journal_t *journal,
/* How to differentiate between interrupted commit
* and journal corruption ?
*
+ * Assume: Physical block size 512 bytes,
+ * j->j_blocksize=1024
+ * If the 1st physical block of a commit block is
+ * written, then the correct magic/block type/
+ * sequence number/commit time will be there.
+ * If the 2nd block is not written, then the csum
+ * verification will fail, because the csum is
+ * calculated over the whole JBD2 block.
+ *
+ * Thus: Only only async_commit, n-th transaction fails
+ * csum check, (n+1)th transaction passes csum check,
+ * is a journal corruption. Everything else could be
+ * just an interrupted write.
+ *
* {nth transaction}
* Checksum Verification Failed
* |
@@ -717,7 +731,7 @@ static int do_one_pass(journal_t *journal,
* async_commit sync_commit
* | |
* | GO TO NEXT "Journal Corruption"
- * | TRANSACTION
+ * | TRANSACTION or "Interrupted Commit"
* |
* {(n+1)th transanction}
* |
@@ -806,8 +820,9 @@ static int do_one_pass(journal_t *journal,
info->end_transaction = next_commit_ID;
if (!jbd2_has_feature_async_commit(journal)) {
- journal->j_failed_commit =
- next_commit_ID;
+ /* Interrupted commit or corrupt
+ * journal. Assume interrupted commit.
+ */
brelse(bh);
break;
}
--
2.33.1
From 693bcf65312e202c15b2f291e1d105b5486cfeb9 Mon Sep 17 00:00:00 2001
From: Manfred Spraul <manfred@xxxxxxxxxxxxxxxx>
Date: Thu, 30 Dec 2021 08:03:04 +0100
Subject: [PATCH 3/3] DEBUG patch: add printk to fs/jbd2/recovery.c
debug printouts.
---
fs/jbd2/journal.c | 2 +-
fs/jbd2/recovery.c | 13 +++++++++++++
2 files changed, 14 insertions(+), 1 deletion(-)
diff --git a/fs/jbd2/journal.c b/fs/jbd2/journal.c
index 35302bc192eb..58545fa74b9f 100644
--- a/fs/jbd2/journal.c
+++ b/fs/jbd2/journal.c
@@ -49,7 +49,7 @@
#include <asm/page.h>
#ifdef CONFIG_JBD2_DEBUG
-ushort jbd2_journal_enable_debug __read_mostly;
+ushort jbd2_journal_enable_debug __read_mostly = 99;
EXPORT_SYMBOL(jbd2_journal_enable_debug);
module_param_named(jbd2_debug, jbd2_journal_enable_debug, ushort, 0644);
diff --git a/fs/jbd2/recovery.c b/fs/jbd2/recovery.c
index 8ca3527189f8..b66884786cb9 100644
--- a/fs/jbd2/recovery.c
+++ b/fs/jbd2/recovery.c
@@ -428,6 +428,9 @@ static int jbd2_commit_block_csum_verify(journal_t *j, void *buf)
h->h_chksum[0] = 0;
calculated = jbd2_chksum(j, j->j_csum_seed, buf, j->j_blocksize);
h->h_chksum[0] = provided;
+if (provided != cpu_to_be32(calculated)) {
+pr_info("commit_csum_verify error: provided %x.\n", provided);
+}
return provided == cpu_to_be32(calculated);
}
@@ -478,6 +481,7 @@ static int do_one_pass(journal_t *journal,
sb = journal->j_superblock;
next_commit_ID = be32_to_cpu(sb->s_sequence);
+pr_info("next_commit_id: initial value %u.\n", (unsigned int) next_commit_ID);
next_log_block = be32_to_cpu(sb->s_start);
first_commit_ID = next_commit_ID;
@@ -783,6 +787,7 @@ static int do_one_pass(journal_t *journal,
brelse(bh);
break;
}
+pr_info("crc32_sum %x found_csum %x.\n", crc32_sum, found_chksum);
/* Neither checksum match nor unused? */
if (!((crc32_sum == found_chksum &&
@@ -800,22 +805,28 @@ static int do_one_pass(journal_t *journal,
if (pass == PASS_SCAN &&
!jbd2_commit_block_csum_verify(journal,
bh->b_data)) {
+pr_info("direct block verify error.\n");
chksum_error:
+pr_info("chksum_error.\n");
if (commit_time < last_trans_commit_time)
goto ignore_crc_mismatch;
+pr_info("error line 2.\n");
info->end_transaction = next_commit_ID;
if (!jbd2_has_feature_async_commit(journal)) {
+pr_info("error line 3.\n");
journal->j_failed_commit =
next_commit_ID;
brelse(bh);
break;
}
+pr_info("error line 4.\n");
}
if (pass == PASS_SCAN)
last_trans_commit_time = commit_time;
brelse(bh);
next_commit_ID++;
+pr_info("next_commit_id increased: %u.\n", (unsigned int) next_commit_ID);
continue;
case JBD2_REVOKE_BLOCK:
@@ -859,10 +870,12 @@ static int do_one_pass(journal_t *journal,
* log. If the latter happened, then we know that the "current"
* transaction marks the end of the valid log.
*/
+pr_info("done, info->end_transaction is %ld.\n", (long)info->end_transaction);
if (pass == PASS_SCAN) {
if (!info->end_transaction)
info->end_transaction = next_commit_ID;
+pr_info("done with update, info->end_transaction is %ld.\n", (long)info->end_transaction);
} else {
/* It's really bad news if different passes end up at
* different places (but possible due to IO errors). */
--
2.33.1