-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA256

On 2019/3/21 3:33 AM, Dennis Schridde wrote:
> On Wednesday, 20 March 2019 12:16:29 CET Coly Li wrote:
>> On 2019/3/20 5:42 AM, Dennis Schridde wrote:
>>> Hello!
>>>
>>> During boot my bcache device cannot be activated anymore and
>>> hence the filesystem content is inaccessible. It appears that
>>> parts of the journal are corrupted, since dmesg says:
>>> ```
>>> bcache: register_bdev() registered backing device sda3
>>> bcache: error on UUID: bcache: journal entries X-Y missing! (replaying X-Z) , disabling caching
>>> bcache: bch_count_io_errors() nvme0n1: IO error on writing btree.
>>> bcache: bch_btree_insert() error -5
>>> bcache: bch_cached_dev_attach() Can't attach sda3: shutting down
>>> bcache: register_cache() registered cache device nvme0n1
>>> bcache: bch_count_io_errors() nvme0n1: IO error on writing btree.
>>> bcache: bch_count_io_errors() nvme0n1: IO error on writing btree.
>>> bcache: bch_count_io_errors() nvme0n1: IO error on writing btree.
>>> bcache: bch_count_io_errors() nvme0n1: IO error on writing btree.
>>> bcache: bch_count_io_errors() nvme0n1: IO error on writing btree.
>>> bcache: bch_count_io_errors() nvme0n1: IO error on writing btree.
>>> bcache: cache_set_free() Cache set UUID unregistered
>>> ```
>>>
>>> UUID represents a UUID. X, Y, Z are integers, with X<Y<Z,
>>> Y=X+12 and Z=Y+116.
>>>
>>> Error -5 is EIO, i.e. a generic I/O error. Is there a way to
>>> get more information on where that error originates from and
>>> what exactly is broken? Did bcache just detect broken data, or
>>> is the device itself broken? Which device, the HDD or the NVMe
>>> SSD?
>>>
>>> Is there a way to recover from this without losing all data
>>> on the drive? Is it maybe possible to just discard the
>>> journal entries >X and return to the state the block device was
>>> in at point X, losing only modifications after that point?
>>>
>>> Background: The situation appeared after my computer was
>>> running for a few hours and the screen stayed dark when I tried
>>> to wake the monitor from standby. The machine did not react to
>>> NumLock or Ctrl+Alt+Del, so I issued a magic SysRq and tried
>>> to safely reboot the machine by slowly typing REISUB. Sadly
>>> after this the machine ended up in the state described above.
>>
>> It seems some journal set was lost during bch_journal_replay()
>> when the cache set was started after the reboot.
>>
>> While testing a journal deadlock fix, I also observed this
>> issue. When I change the number of journal buckets from 256 to 8,
>> the problem can be observed on almost every reboot.
>>
>> This one is not fixed yet and I am currently working on it.
>>
>> What kernel version do you use? I thought this issue was only
>> introduced by my current changes, but from your report it seems
>> the problem happens in the upstream kernel as well.
>
> I was using Linux 5.0.2 (with Gentoo patches, which are minimal,
> AFAIK).
>
> I would have expected that S and/or U in REISUB would write all
> bcache metadata to disk and prevent such problems. Is this a wrong
> assumption?
>
> Will your patches allow me to use the cache again, or will they
> prevent the metadata from breaking in the first place?

I am still looking for the reason why this problem happens. Once I
have a fix, I will let you know.

Thanks.

Coly Li
- --
Coly Li
-----BEGIN PGP SIGNATURE-----

iQIzBAEBCAAdFiEE6j5FL/T5SGCN6PrQxzkHk2t9+PwFAlyTGEQACgkQxzkHk2t9
+Pwa2A//Xx3NcWEwxBkG9cYX8UjjLGilJO9PGtfC1U4CIKscFalohJ4f+28Vt/Qv
Iqsbo87O+YqzidOI4L1StUOvCgMexg5A0GoUTFCZLI8G1p6tH6oUUKylmuiDSSiB
ZmDZNzHJRPiC8y1fJwxbFoE1Jx1nvlwFaOJz7OSJfPoxEc8ZCHV/bOseD0xxtB/c
+Pap1yXkS0uBvDNjh9gePMfgq5QYEA+9yrRSCDUPFR38MirrbSw+gGgEqiU3v7YN
f6axuJBAwyDwGMXLG5iYqWUgMuIFWy7kjTJJsIDAZiPqojFgaWdStZM0ynnd2+3P
OpmwipjUksy5L08ZwPvqW562JdvdjksijIrF9vxo+UYhZocbM2IX/8+YSUhFbjVs
OfHHvUXhJs/argIJNGzw5QW18Uepi5J8WrJdONboMGsE1PS0zKQjsT0ToVuGG91t
WH6fOwQtuhuPWt66APWE0/WT+J2wtvfIGRYNSEaNGbJSVC4S8pzPa0r+0kRnu361
gxOVYsMe8/Quc1r+1Y/z/2jEG1Cut67Ed1U64ebIMnrAXYW+8D4IaC4qKnPumYHx
fgJKL4WAaj/9aVOaU6vyKX9GG83NoUjN6MxpCx2RPVdKOk3lv3Y4uzBDYtDoBjet
Gq9E73720JiTL4qWWXyJHwxmwMNNwi9+POdjQAaUo4R1E9Mm+uA=
=vCRz
-----END PGP SIGNATURE-----
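
For anyone reading the quoted dmesg output, the two numeric details in it (the error code -5 and the journal ranges X-Y and X-Z) can be worked out with a small script. The following is only a rough sketch, not part of bcache or its tooling: the helper names `describe_errno` and `journal_gap` are hypothetical, the value X=1000 is made up for illustration, and treating both ranges as inclusive is an assumption about how the message should be read.

```python
import errno
import os
import re

# Matches the journal message as it appears in the quoted dmesg output.
JOURNAL_RE = re.compile(
    r"journal entries (\d+)-(\d+) missing! \(replaying (\d+)-(\d+)\)"
)


def describe_errno(code):
    """Map a kernel-style negative error code (e.g. -5) to (name, text)."""
    number = abs(code)
    name = errno.errorcode.get(number, "UNKNOWN")
    return name, os.strerror(number)


def journal_gap(line):
    """Return entry counts for a 'journal entries ... missing!' line, or None.

    Assumption: both ranges in the message are inclusive.
    """
    match = JOURNAL_RE.search(line)
    if match is None:
        return None
    miss_first, miss_last, replay_first, replay_last = (
        int(group) for group in match.groups()
    )
    return {
        "missing": miss_last - miss_first + 1,
        "replay_total": replay_last - replay_first + 1,
        "after_gap": replay_last - miss_last,
    }


if __name__ == "__main__":
    x = 1000            # hypothetical value for X
    y = x + 12          # Y = X + 12, as stated in the report
    z = y + 116         # Z = Y + 116, as stated in the report
    sample = f"bcache: journal entries {x}-{y} missing! (replaying {x}-{z})"
    print(describe_errno(-5))
    print(journal_gap(sample))
```

With the relations given in the report, this would count 13 missing entries at the start of a 129-entry replay range, with 116 entries following the gap. It only restates what the log line says and does not tell which device caused the EIO.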