Hi Junhui,

Now I am able to understand your patch. Yes, this patch may fix one of
the conditions in which a jset gets lost. We should have this fix in
v5.1; I will handle the format issue. And if you don't mind, I may
re-compose the commit log to explain what exactly is fixed. (A small
stand-alone sketch of the check it adds is appended below the quoted
thread.)

Thanks.

Coly Li

On 2019/3/21 7:04 PM, Junhui Tang wrote:
> I met this bug and sent a patch before.
> Please have a try with this patch:
>
> https://www.spinics.net/lists/linux-bcache/msg06555.html
>
> From: Tang Junhui <tang.junhui.linux@xxxxxxxxx>
> Date: Wed, 12 Sep 2018 04:42:14 +0800
> Subject: [PATCH] bcache: fix failure in journal replay
>
> Journal replay failed with messages:
> Sep 10 19:10:43 ceph kernel: bcache: error on
> bb379a64-e44e-4812-b91d-a5599871a3b1: bcache: journal entries
> 2057493-2057567 missing! (replaying 2057493-2076601), disabling
> caching
>
> The reason is that in journal_reclaim() we send discard commands and
> reclaim those journal buckets whose seq is older than last_seq_now,
> but the machine is restarted before a journal with last_seq_now is
> written, so the journal with last_seq_now never reaches a journal
> bucket, and the last_seq_wrote in the newest on-disk journal is older
> than the last_seq_now we expect it to be. So when we do replay,
> journals from last_seq_wrote to last_seq_now are missing.
>
> It is hard to write a journal immediately after journal_reclaim(),
> and it is harmless if those missed journals were caused by
> discarding, since their keys have already been written into btree
> nodes. So, if the missing seqs start from the beginning of the replay
> range, we treat it as normal, only print a message to show the
> missing journal entries, and point out that they may be caused by
> discarding.
>
> Signed-off-by: Tang Junhui <tang.junhui.linux@xxxxxxxxx>
> ---
>  drivers/md/bcache/journal.c | 8 ++++++--
>  1 file changed, 6 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/md/bcache/journal.c b/drivers/md/bcache/journal.c
> index 10748c6..9b4cd2e 100644
> --- a/drivers/md/bcache/journal.c
> +++ b/drivers/md/bcache/journal.c
> @@ -328,9 +328,13 @@ int bch_journal_replay(struct cache_set *s, struct list_head *list)
>  	list_for_each_entry(i, list, list) {
>  		BUG_ON(i->pin && atomic_read(i->pin) != 1);
>
> -		cache_set_err_on(n != i->j.seq, s,
> -"bcache: journal entries %llu-%llu missing! (replaying %llu-%llu)",
> +		if (n != i->j.seq && n == start)
> +			pr_info("bcache: journal entries %llu-%llu may be discarded! (replaying %llu-%llu)",
>  				n, i->j.seq - 1, start, end);
> +		else
> +			cache_set_err_on(n != i->j.seq, s,
> +		"bcache: journal entries %llu-%llu missing! (replaying %llu-%llu)",
> +				n, i->j.seq - 1, start, end);
>
>  		for (k = i->j.start;
>  		     k < bset_bkey_last(&i->j);
> --
> 1.8.3.1
>
>
> Coly Li <colyli@xxxxxxx> wrote on Thursday, 21 March 2019 at 12:52 PM:
>
> On 2019/3/21 3:33 AM, Dennis Schridde wrote:
>> On Wednesday, 20 March 2019 12:16:29 CET Coly Li wrote:
>>> On 2019/3/20 5:42 AM, Dennis Schridde wrote:
>>>> Hello!
>>>>
>>>> During boot my bcache device cannot be activated anymore and
>>>> hence the filesystem content is inaccessible. It appears that
>>>> parts of the journal are corrupted, since dmesg says:
>>>> ```
>>>> bcache: register_bdev() registered backing device sda3
>>>> bcache: error on UUID: bcache: journal entries X-Y missing!
>>>> (replaying X-Z), disabling caching
>>>> bcache: bch_count_io_errors() nvme0n1: IO error on writing btree.
>>>> bcache: bch_btree_insert() error -5
>>>> bcache: bch_cached_dev_attach() Can't attach sda3: shutting down
>>>> bcache: register_cache() registered cache device nvme0n1
>>>> bcache: bch_count_io_errors() nvme0n1: IO error on writing btree.
>>>> bcache: bch_count_io_errors() nvme0n1: IO error on writing btree.
>>>> bcache: bch_count_io_errors() nvme0n1: IO error on writing btree.
>>>> bcache: bch_count_io_errors() nvme0n1: IO error on writing btree.
>>>> bcache: bch_count_io_errors() nvme0n1: IO error on writing btree.
>>>> bcache: bch_count_io_errors() nvme0n1: IO error on writing btree.
>>>> bcache: cache_set_free() Cache set UUID unregistered
>>>> ```
>>>>
>>>> UUID represents a UUID. X, Y, Z are integers, with X<Y<Z,
>>>> Y=X+12 and Z=Y+116.
>>>>
>>>> Error -5 is EIO, i.e. a generic I/O error. Is there a way to
>>>> get more information on where that error originates from and
>>>> what exactly is broken? Did bcache just detect broken data, or
>>>> is the device itself broken? Which device, the HDD or the NVMe
>>>> SSD?
>>>>
>>>> Is there a way to recover from this without losing all data on
>>>> the drive? Is it maybe possible to just discard the journal
>>>> entries >X and return to the state the block device was at
>>>> point X, losing only modifications after that point?
>>>>
>>>> Background: The situation appeared after my computer had been
>>>> running for a few hours and the screen stayed dark when I tried
>>>> to wake the monitor from standby. The machine did not react to
>>>> NumLock or Ctrl+Alt+Del, so I issued a magic SysRq and tried to
>>>> safely reboot the machine by slowly typing REISUB. Sadly, after
>>>> this the machine ended up in the state described above.
>>>
>>> It seems some journal set was lost during bch_journal_replay()
>>> after reboot when starting the cache set.
>>>
>>> During my testing of a journal deadlock fix, I also observed this
>>> issue. I changed the number of journal buckets from 256 to 8, and
>>> then such a problem can be observed on almost every reboot.
>>>
>>> This one is not fixed yet and I am currently working on it.
>>>
>>> What kernel version do you use? I thought this issue was only
>>> introduced by my current changes, but from your report it seems
>>> such a problem happens in the upstream kernel as well.
>>
>> I was using Linux 5.0.2 (with Gentoo patches, which are minimal,
>> AFAIK).
>>
>> I would have expected that S and/or U in REISUB would write all
>> bcache metadata to disk and prevent such problems. Is this a wrong
>> assumption?
>>
>> Will your patches allow me to use the cache again, or will they
>> prevent the metadata from breaking in the first place?
>
> I am still looking for the reason why such a problem happens. Once I
> have a fix, I will let you know.
>
> Thanks.
>
> Coly Li

--
Coly Li
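
Appended for reference: a minimal, stand-alone sketch of the check the
patch quoted above adds to bch_journal_replay(). This is not the bcache
code itself; the classify() helper, the seq numbers and the replay
ranges below are made up for illustration, and only the gap
classification (n != seq && n == start) mirrors the patch. Note that
both the commit message and the dmesg output in the original report
show the missing range starting exactly at the replay start, which is
the case the patch downgrades from a cache set error to an
informational message.

```
#include <stdint.h>
#include <stdio.h>

/*
 * User-space illustration of the branch added by the patch above.
 * In the kernel, 'n' is the next expected journal seq, 'start'/'end'
 * bound the replay range, and the loop walks the journal entries that
 * were actually read from disk.  classify() and the values in main()
 * are made up for demonstration only.
 */
static void classify(const uint64_t *seqs, int nr,
		     uint64_t start, uint64_t end)
{
	uint64_t n = start;
	int i;

	for (i = 0; i < nr; i++) {
		if (n != seqs[i] && n == start)
			/* Gap at the head of the replay range: those
			 * entries were most likely reclaimed and
			 * discarded, and their keys are already in the
			 * btree, so only report it. */
			printf("journal entries %llu-%llu may be discarded! (replaying %llu-%llu)\n",
			       (unsigned long long)n,
			       (unsigned long long)(seqs[i] - 1),
			       (unsigned long long)start,
			       (unsigned long long)end);
		else if (n != seqs[i])
			/* Gap in the middle of the range: genuinely
			 * missing journal entries, treated as a cache
			 * set error in the kernel. */
			printf("journal entries %llu-%llu missing! (replaying %llu-%llu)\n",
			       (unsigned long long)n,
			       (unsigned long long)(seqs[i] - 1),
			       (unsigned long long)start,
			       (unsigned long long)end);
		n = seqs[i] + 1;	/* next expected seq */
	}
}

int main(void)
{
	/* First on-disk entry starts after the replay start: head gap. */
	const uint64_t head_gap[] = { 108, 109, 110 };
	/* Entries 105-106 are lost in the middle of the range. */
	const uint64_t mid_gap[] = { 100, 101, 102, 103, 104, 107, 108 };

	classify(head_gap, 3, 100, 110);	/* -> "may be discarded" */
	classify(mid_gap, 7, 100, 108);		/* -> "missing!" */
	return 0;
}
```

Compiling and running this prints the "may be discarded" line for the
first range and the "missing!" line for the second, showing which branch
each kind of gap takes.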