Hello Tobiasz! Am Di., 23. Nov. 2021 um 15:48 Uhr schrieb Tobiasz Karoń <unfa00@xxxxxxxxx>: > > Hi! > > TL;DR > > My cache is inconsistent, and that's probably preventing Bcache for m > using it (all I/O goes to the backing device). How can I clear that? I've had a similar problem after bcache crashed due to a bug in the latest kernel. I could resolve it by the following steps (I think you figure out what the PLACEHOLDERS mean): For each backend device, set the cache_mode to none and detach it: # echo none >/sys/block/BDEV/BPART/bcache/cache_mode # echo 1 >/sys/block/BDEV/BPART/bcache/detach Unregister the cache and re-create it (4096 works around the kernel bug, also, it's potentially broken, so re-create): # echo 1 >/sys/fs/bcache/CSETUUID/unregister # bcache make -C -w 4096 -l LABEL --force /dev/BPART Re-attach the devices and set cache mode: # echo NEW_CSETUUID >/sys/block/BDEV/BPART/bcache/attach # echo writearound >/sys/block/BDEV/BPART/bcache/cache_mode I'm explicitly using writearound for btrfs because: * writethrough would write data potentially relocated by COW * writeback potentially destroys btrfs on unexpected bcache failures * the performance difference between writeback and writearound for btrfs is virtually non-existent However, writearound will cache only reads, that means boot-time improvements will lag one boot behind: During the first boot, bcache will read btrfs and cache the reads, on the next boot, it will read the cached data. Using writethrough could work around that but that's not really useful with a COW filesystem because btrfs relocated extents on each and every tiny write - making any cached data stale and thus occupy bcache space for no reason. So it will also amplify writes to the SSD for no real reason. Youtube: The problem you see and documented is exactly what happened to me (but on Gentoo: system froze, reboot hung, rescue disk said: cache disabled with a similar message), and you can work around it by using blocksize 4096 - and in any case it still happens: Do NOT use writeback caching, use writearound as mentioned above, then at least it won't destroy btrfs and it's a matter of re-creating the cache as outlined above. HTH Kai > Details: > > I've been using Bcache for the past few months on my root Btrfs > filesystem with success. > Then one day out of the blue Bcache failed and took my Btrfs > filesystem with it (details: > https://www.youtube.com/watch?v=Hf3zr6CxvmI, looks similar to this: > https://stackoverflow.com/questions/22820492/how-to-revert-bcache-device-to-regular-device). > That's not the topic of my message though. > I've done a clean Arch Linux installation on Bcache + Btrfs once again > using an SSD partition for cache and an HDD as the backing device. > > However, this time it doesn't do anything... > I was unable to find any information online to solve this. > > My Bcache device works fine, the system boots off of it. However all > I/O goes straight to the backing HDD, and the SSD is unused. Needless > to say this means the performance is not what I got used to when > Bcache was working fine. > > Here's what a 3rd party bcache-status script says (it'd be great if > bcache-tools would provide something like this, BTW): > > ❯ bcache-status > --- bcache --- > Device ? (?) > UUID c9cd8259-3cee-42ff-a8ec-e11193c09b7e > Block Size 0.50KiB > Bucket Size 512.00KiB > Congested? False > Read Congestion 2.0ms > Write Congestion 20.0ms > Total Cache Size 173.97GiB > Total Cache Used 8.70GiB (5%) > Total Cache Unused 165.27GiB (95%) > Dirty Data 0.50KiB (0%) > Evictable Cache 173.97GiB (100%) > Replacement Policy [lru] fifo random > Cache Mode (Unknown) > Total Hits 0 > Total Misses 0 > Total Bypass Hits 0 > Total Bypass Misses 0 > Total Bypassed 0B > > The Total Cache Used value has not changed since I've done my initial > Arch Linux installation. It seems that Bcache has "turned off" by that > point. > > Here's the bcache supers fro the backing device and cache > > ❯ bcache-super-show /dev/sda > sb.magic ok > sb.first_sector 8 [match] > sb.csum 4E6EACCA74AB0AE5 [match] > sb.version 1 [backing device] > > dev.label unfa-desktop%20root > dev.uuid 49202fdf-fbe5-48fd-bdd8-df5414da817c > dev.sectors_per_block 8 > dev.sectors_per_bucket 1024 > dev.data.first_sector 16 > dev.data.cache_mode 0 [writethrough] > dev.data.cache_state 3 [inconsistent] > > cset.uuid 9572380e-8e6f-4ce4-8323-80b98a85eeed > > ❯ bcache-super-show /dev/sdd3 > sb.magic ok > sb.first_sector 8 [match] > sb.csum 259C90FD74B4D4BE [match] > sb.version 3 [cache device] > > dev.label (empty) > dev.uuid 95c6449a-03b5-40f2-a8cc-80b1b61c5ef0 > dev.sectors_per_block 1 > dev.sectors_per_bucket 1024 > dev.cache.first_sector 1024 > dev.cache.cache_sectors 364833792 > dev.cache.total_sectors 364834816 > dev.cache.ordered yes > dev.cache.discard no > dev.cache.pos 0 > dev.cache.replacement 0 [lru] > > cset.uuid c9cd8259-3cee-42ff-a8ec-e11193c09b7e > > BTW - I've now realized I've set a label for the backing device but > not the cache. maybe this is the reason? I don't think it should work > this way but I've cleared the label on my backing device just to be > sure. > > Hmm. The cache in inconsistent. I had this before I reinstalled my OS. > I have recreated the bcache cache on the SSD and was hoping that will > solve it. > I don't know what I should do with this, is this the reason why it's > not working? > > I was wondering if washing the partition and recreating the cache > would help, but I don't want to needlessly wear down the SSD if that > won't help. > > Needless to say I would really like to avoid data loss when using > Bcache - it's awesome, and the developer says it's perfectly stable > and safe, but I've had a sudden failure and others had such as well > (without seeing any hardware issues that could be causing that). Maybe > I should quit using Bcache all together? Maybe it's not > production-ready? I was wondering about maybe using Bcachefs, though > the need to compile a custom kernel for it is quite a deterrent. I > tried it briefly, but the bcachefs-tools stopped working at some point > without a visible reason. I know Btrfs is flawed, though it seems to > be the best so far. > > Thank you for your work, > - unfa > > -- > - Tobiasz 'unfa' Karoń > > www.youtube.com/unfa000