Re: Bcache is not caching anything. cache state=inconsistent, how to clear?

I think I have solved it after reading up on
https://www.kernel.org/doc/html/latest/admin-guide/bcache.html.

1. I set the cache mode to none.
2. I detached the caching device.
3. I unregistered it.
4. I ran wipefs on it.
5. I recreated the bcache caching device (using --cset-uuid to put it
straight into the right bcache set).
6. I registered the cache and reattached it to the backing device.
7. Now my backing device shows its state as clean again.
8. I enabled writearound caching for now (I will enable writeback if
all goes well).
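
For readers following along, the steps above map roughly onto the sysfs files and tools described in the kernel admin guide. The sketch below is a dry run: it only builds the command strings and executes nothing. It assumes the backing device appears as `bcache0` (not stated in this message), while `/dev/sdd3` and the cache set UUID are taken from the original report. The helper name `recache_plan` is hypothetical.

```python
def recache_plan(bcache_dev, cache_part, cset_uuid):
    """Return shell commands for steps 1-8 as strings; nothing is executed."""
    # Assumed sysfs directory of the bcache device (e.g. bcache0).
    backing = f"/sys/block/{bcache_dev}/bcache"
    return [
        f"echo none > {backing}/cache_mode",                      # 1. cache mode none
        f"echo 1 > {backing}/detach",                             # 2. detach from the cache set
        f"echo 1 > /sys/fs/bcache/{cset_uuid}/unregister",        # 3. unregister the cache set
        f"wipefs -a {cache_part}",                                # 4. erase old signatures
        f"make-bcache -C {cache_part} --cset-uuid {cset_uuid}",   # 5. recreate the cache
        f"echo {cache_part} > /sys/fs/bcache/register",           # 6. register (udev may already do this)
        f"echo {cset_uuid} > {backing}/attach",                   # 6. reattach to the backing device
        f"cat {backing}/state",                                   # 7. should report "clean"
        f"echo writearound > {backing}/cache_mode",               # 8. writearound for now
    ]


for cmd in recache_plan("bcache0", "/dev/sdd3",
                        "c9cd8259-3cee-42ff-a8ec-e11193c09b7e"):
    print(cmd)
```

Printing the plan first and running the lines by hand as root keeps each step reviewable, which seems prudent given that steps 4 and 5 destroy the cache contents.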

It seems the cache is working again:

❯ bcache-status
--- bcache ---
Device                      /dev/sda (8:0)
UUID                        c9cd8259-3cee-42ff-a8ec-e11193c09b7e
Block Size                  0.50KiB
Bucket Size                 512.00KiB
Congested?                  False
Read Congestion             2.0ms
Write Congestion            20.0ms
Total Cache Size            173.97GiB
Total Cache Used            1.74GiB     (1%)
Total Cache Unused          172.23GiB   (99%)
Dirty Data                  0.50KiB     (0%)
Evictable Cache             173.97GiB   (100%)
Replacement Policy          [lru] fifo random
Cache Mode                  [writethrough] writeback writearound none
Total Hits                  2   (0%)
Total Misses                1506
Total Bypass Hits           0   (0%)
Total Bypass Misses         9138
Total Bypassed              183.70MiB
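
A note on reading the percentages: the "(0%)" next to Total Hits is a rounded figure, not proof that nothing was served from the cache. A minimal sketch of the arithmetic, assuming the ratio is hits divided by hits plus misses (how the third-party script computes it has not been checked here):

```python
def hit_ratio(hits, misses):
    """Fraction of cache lookups that were hits; 0.0 when there were no lookups."""
    total = hits + misses
    return hits / total if total else 0.0


# Values taken from the bcache-status output above.
print(f"{hit_ratio(2, 1506):.2%}")   # 0.13%
```

Two hits out of 1508 lookups is small but non-zero, which is consistent with a freshly recreated cache that has barely warmed up.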

### Part 2: It's not over!

Soon after I had done this and all seemed to be well, bcache
imploded once again, this time thankfully without taking down my root
filesystem, probably because it was not in writeback mode.
My OS didn't boot, and I got another checksum error at some bucket and
a "disabled caching" message.

I suspect it was due to my mistake: I had deleted and recreated the
bcache cache without rebooting in between, and maybe something went
wrong because of that. I rebooted into a live system and deleted the
cache again (my backing device was clean).

I've written all zeros to the partition before recreating the cache
this time though.
I suspect maybe bcache found old data there and got confused? Wipefs
only deletes superblocks.
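
For illustration, zeroing a whole partition (as opposed to erasing only its signatures) can be sketched as below. This is an assumption about the general technique, not the exact command used here; `zero_fill` is a hypothetical helper, it destroys everything at the given path, and it needs root when pointed at a real block device.

```python
import os


def zero_fill(path, chunk_size=1024 * 1024):
    """Overwrite every byte of an existing file or block device with zeros.

    Returns the number of bytes written. DESTRUCTIVE: double-check the path.
    """
    zeros = bytes(chunk_size)
    with open(path, "r+b") as dev:
        # Seeking to the end yields the size, which also works for block devices.
        size = dev.seek(0, os.SEEK_END)
        dev.seek(0)
        remaining = size
        while remaining > 0:
            n = min(chunk_size, remaining)
            dev.write(zeros[:n])
            remaining -= n
        dev.flush()
        os.fsync(dev.fileno())   # make sure the zeros actually reach the device
    return size
```

Trying it on a scratch file first (for example one created with `tempfile`) is a cheap way to confirm the behaviour before aiming it at a partition such as `/dev/sdd3`.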

Before doing anything though I've mounted my backing filesystem with
`mount -o 8192` and backed it up using a btrfs-clone Python script.
After I've verified my backup was working I've unmounted the backup
medium and proceeded to recreate the cache and reattach it.

I've also found that `running` was `0` for my bcache set, so I have
turned it on.
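
The kernel admin guide lists `running` among the backing-device attributes under `/sys/block/<device>/bcache`. A minimal sketch of checking and setting it, assuming the device is `bcache0` (the actual path is not given in this message) and using a hypothetical helper name:

```python
from pathlib import Path


def ensure_running(bcache_dir):
    """Write 1 to `running` if it currently reads 0; return True if a write happened."""
    attr = Path(bcache_dir) / "running"
    if attr.read_text().strip() == "0":
        attr.write_text("1\n")
        return True
    return False


# Example (assumed path, needs root):
# ensure_running("/sys/block/bcache0/bcache")
```

Because the function only takes a directory, it can be exercised against a temporary directory containing a fake `running` file before touching sysfs.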

After a reboot everything was back to normal.

I *hope* this will keep working. Last time Bcache broke and took my
filesystem with it even though nothing significant had happened
beforehand. I'd love to know if it's considered stable or what could
be causing spontaneous failures.

- unfa



On Tue, 23 Nov 2021 at 15:48, Tobiasz Karoń <unfa00@xxxxxxxxx> wrote:
>
> Hi!
>
> TL;DR
>
> My cache is inconsistent, and that's probably preventing Bcache from
> using it (all I/O goes to the backing device). How can I clear that?
>
> Details:
>
> I've been using Bcache for the past few months on my root Btrfs
> filesystem with success.
> Then one day out of the blue Bcache failed and took my Btrfs
> filesystem with it (details:
> https://www.youtube.com/watch?v=Hf3zr6CxvmI, looks similar to this:
> https://stackoverflow.com/questions/22820492/how-to-revert-bcache-device-to-regular-device).
> That's not the topic of my message though.
> I've done a clean Arch Linux installation on Bcache + Btrfs once again
> using an SSD partition for cache and an HDD as the backing device.
>
> However, this time it doesn't do anything...
> I was unable to find any information online to solve this.
>
> My Bcache device works fine, the system boots off of it. However all
> I/O goes straight to the backing HDD, and the SSD is unused. Needless
> to say this means the performance is not what I got used to when
> Bcache was working fine.
>
> Here's what a 3rd party bcache-status script says (it'd be great if
> bcache-tools would provide something like this, BTW):
>
> ❯ bcache-status
> --- bcache ---
> Device                      ? (?)
> UUID                        c9cd8259-3cee-42ff-a8ec-e11193c09b7e
> Block Size                  0.50KiB
> Bucket Size                 512.00KiB
> Congested?                  False
> Read Congestion             2.0ms
> Write Congestion            20.0ms
> Total Cache Size            173.97GiB
> Total Cache Used            8.70GiB     (5%)
> Total Cache Unused          165.27GiB   (95%)
> Dirty Data                  0.50KiB     (0%)
> Evictable Cache             173.97GiB   (100%)
> Replacement Policy          [lru] fifo random
> Cache Mode                  (Unknown)
> Total Hits                  0
> Total Misses                0
> Total Bypass Hits           0
> Total Bypass Misses         0
> Total Bypassed              0B
>
> The Total Cache Used value has not changed since I've done my initial
> Arch Linux installation. It seems that Bcache has "turned off" by that
> point.
>
> Here are the bcache superblocks for the backing device and the cache:
>
> ❯ bcache-super-show /dev/sda
> sb.magic                ok
> sb.first_sector         8 [match]
> sb.csum                 4E6EACCA74AB0AE5 [match]
> sb.version              1 [backing device]
>
> dev.label               unfa-desktop%20root
> dev.uuid                49202fdf-fbe5-48fd-bdd8-df5414da817c
> dev.sectors_per_block   8
> dev.sectors_per_bucket  1024
> dev.data.first_sector   16
> dev.data.cache_mode     0 [writethrough]
> dev.data.cache_state    3 [inconsistent]
>
> cset.uuid               9572380e-8e6f-4ce4-8323-80b98a85eeed
>
> ❯ bcache-super-show /dev/sdd3
> sb.magic                ok
> sb.first_sector         8 [match]
> sb.csum                 259C90FD74B4D4BE [match]
> sb.version              3 [cache device]
>
> dev.label               (empty)
> dev.uuid                95c6449a-03b5-40f2-a8cc-80b1b61c5ef0
> dev.sectors_per_block   1
> dev.sectors_per_bucket  1024
> dev.cache.first_sector  1024
> dev.cache.cache_sectors 364833792
> dev.cache.total_sectors 364834816
> dev.cache.ordered       yes
> dev.cache.discard       no
> dev.cache.pos           0
> dev.cache.replacement   0 [lru]
>
> cset.uuid               c9cd8259-3cee-42ff-a8ec-e11193c09b7e
>
> BTW - I've now realized I've set a label for the backing device but
> not the cache. maybe this is the reason? I don't think it should work
> this way but I've cleared the label on my backing device just to be
> sure.
>
> Hmm. The cache is inconsistent. I had this before I reinstalled my OS.
> I recreated the bcache cache on the SSD and was hoping that would
> solve it.
> I don't know what I should do with this. Is this the reason why it's
> not working?
>
> I was wondering if wiping the partition and recreating the cache
> would help, but I don't want to needlessly wear down the SSD if that
> won't help.
>
> Needless to say I would really like to avoid data loss when using
> Bcache - it's awesome, and the developer says it's perfectly stable
> and safe, but I've had a sudden failure and others had such as well
> (without seeing any hardware issues that could be causing that). Maybe
> I should quit using Bcache altogether? Maybe it's not
> production-ready? I was wondering about maybe using Bcachefs, though
> the need to compile a custom kernel for it is quite a deterrent. I
> tried it briefly, but the bcachefs-tools stopped working at some point
> without a visible reason. I know Btrfs is flawed, though it seems to
> be the best so far.
>
> Thank you for your work,
> - unfa
>
> --
> - Tobiasz 'unfa' Karoń
>
> www.youtube.com/unfa000



--
- Tobiasz 'unfa' Karoń

www.youtube.com/unfa000



