Re: Recover from "journal entries X-Y missing! (replaying X-Z)", "IO error on writing btree."

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA256

On 2019/3/24 7:51, Dennis Schridde wrote:
> On Thursday, 21 March 2019 12:04:02 CET Junhui Tang wrote:
>> I hit this bug and sent a patch for it earlier. Please give this
>> patch a try.
>> 
>> https://www.spinics.net/lists/linux-bcache/msg06555.html
>> 
>> From: Tang Junhui <tang.junhui.linux@xxxxxxxxx>
>> Date: Wed, 12 Sep 2018 04:42:14 +0800
>> Subject: [PATCH] bcache: fix failure in journal replay
>> 
>> journal replay failed with messages:
>> Sep 10 19:10:43 ceph kernel: bcache: error on
>> bb379a64-e44e-4812-b91d-a5599871a3b1: bcache: journal entries
>> 2057493-2057567 missing! (replaying 2057493-2076601), disabling caching
>> 
>> The reason is that in journal_reclaim() we send a discard command
>> and reclaim those journal buckets whose seq is older than
>> last_seq_now. If the machine restarts before we write a journal
>> entry carrying last_seq_now, that entry never reaches the journal
>> bucket, so the last_seq_wrote recorded in the newest journal entry
>> is older than the last_seq_now we expect. When we then replay, the
>> journal entries from last_seq_wrote to last_seq_now are missing.
>> 
>> It is hard to write a journal entry immediately after
>> journal_reclaim(), and it is harmless if the missing journal
>> entries were lost to discarding, since those entries have already
>> been written to btree nodes. So if the missing seqs start at the
>> beginning of the journal, we treat it as normal and only print a
>> message that reports the missing entries and points out that
>> discarding may be the cause.
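
As a rough userspace illustration of the rule described above (not the kernel code itself), the check can be sketched as follows. The names check_replay and enum replay_result are hypothetical and exist only for this example. The sketch assumes the on-disk sequence numbers are given in strictly ascending order and that none is below start.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical result type for this sketch; it is not a bcache type. */
enum replay_result {
    REPLAY_OK,           /* every expected entry was found */
    REPLAY_LEADING_GAP,  /* entries missing only at the very start */
    REPLAY_INTERIOR_GAP  /* entries missing after replay had begun */
};

/*
 * seqs[] holds the sequence numbers of the journal entries found on
 * disk, in ascending order. start is the first sequence number that
 * replay expects to see.
 */
static enum replay_result check_replay(const uint64_t *seqs, size_t count,
                                       uint64_t start)
{
    enum replay_result result = REPLAY_OK;
    uint64_t n = start; /* next sequence number we expect */

    assert(seqs != NULL || count == 0);

    for (size_t idx = 0; idx < count; idx++) {
        if (seqs[idx] != n) {
            /* A gap after at least one entry was replayed is an error. */
            if (n != start)
                return REPLAY_INTERIOR_GAP;
            /* A gap at the very beginning may be due to discarding. */
            result = REPLAY_LEADING_GAP;
        }
        n = seqs[idx] + 1;
    }
    return result;
}
```

With start set to 2057493 and the first entry found at 2057568, as in the log message above, this sketch reports a leading gap instead of an error. A hole anywhere later in the sequence is still reported as an interior gap.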
>> 
>> Signed-off-by: Tang Junhui <tang.junhui.linux@xxxxxxxxx>
>> ---
>>  drivers/md/bcache/journal.c | 8 ++++++--
>>  1 file changed, 6 insertions(+), 2 deletions(-)
>> 
>> diff --git a/drivers/md/bcache/journal.c b/drivers/md/bcache/journal.c
>> index 10748c6..9b4cd2e 100644
>> --- a/drivers/md/bcache/journal.c
>> +++ b/drivers/md/bcache/journal.c
>> @@ -328,9 +328,13 @@ int bch_journal_replay(struct cache_set *s, struct list_head *list)
>>  	list_for_each_entry(i, list, list) {
>>  		BUG_ON(i->pin && atomic_read(i->pin) != 1);
>>  
>> -		cache_set_err_on(n != i->j.seq, s,
>> -"bcache: journal entries %llu-%llu missing! (replaying %llu-%llu)",
>> +		if (n != i->j.seq && n == start)
>> +			pr_info("bcache: journal entries %llu-%llu may be discarded! (replaying %llu-%llu)", n, i->j.seq - 1, start, end);
>> +		else
>> +			cache_set_err_on(n != i->j.seq, s,
>> +	"bcache: journal entries %llu-%llu missing! (replaying %llu-%llu)",
>> +				 n, i->j.seq - 1, start, end);
>>  
>>  		for (k = i->j.start; k < bset_bkey_last(&i->j);
> 
> Hi!
> 
> Thanks a lot!  I patched Linux 5.0.2 with your patch (after
> cleaning up the whitespace to match the actual source) and was able
> to boot my machine with it, which cleaned up the bcache issue and
> allowed me to subsequently boot the machine using an unpatched
> kernel again.
Hi Dennis,

After Junhui posts an updated version of this fix, can I add you as
Tested-by?

Thanks.

Coly Li
-----BEGIN PGP SIGNATURE-----

iQIzBAEBCAAdFiEE6j5FL/T5SGCN6PrQxzkHk2t9+PwFAlyXCk0ACgkQxzkHk2t9
+Pxr5BAAg8EosuOWWup+RvKGs5bUoJOYtB1psBxWUd+fVIemdNwl1brRan1IQv99
BwJ+ISlVjpbfn8HdGtsD+K9f4YyF/JD7rLIp68+TV5EZejQqKlZdR2hmO6NDUfgC
rpMW47mzX8kxommdogOrqG2A46EsPlzL/nLrdQBP2Q+aRxUToslawz+Jc6H+vjyX
8PyCVV6pGn1xwEZSSBw3zU9n5Ac8bvma4zGm4AoL8ccegt6952kfmSSr1ac9fAx7
M/rscnWsTvtBfM1IeFNdLa3OD/Ic08RzkQdshAMEkNO2ZLmthJ1NcKvFR6UMxrAa
yOIY/vcAWMnNyfoopYlx40nd9u5hPTSR2dSdYPNAIxGVS+ibgWHDCJbq9WU4/6Do
GsIYHOom7aDlSAiT+Lwupys/NcGcO7qfq7GuGdFB8z8lEUULJ6qEzetCcsDHPgfX
X/kgTU4472wQfchRJZQBqrDRSZDHJ9YY2iqTHuH1pv2tPMwflIYr6ZSXZGjX51u/
E5jHS5wJ4Q68+7FzCDJ6QimK7I6++U2u+UK7lCkwM8SI64HjdnRb13hmEMaqa/jt
2ytIfWixVz3Y6LGtFvTKIW9tWHrheHHn63Z/7c30PHSiE1lE1Jn71UsTiKOKSCiI
dp0swLwovUBJ7mQuSuLBDC2yKyKaTN0hrrBHTqMFau5njzHoc1w=
=YuL3
-----END PGP SIGNATURE-----


