Re: bcache fails after reboot if discard is enabled

> It works perfectly fine here with latest 3.18. My setup is backing a btrfs
> filesystem in write-back mode. I can reboot cleanly, hard-reset upon
> freezes, I had no issues yet and no data loss. Even after hard-reset the
> kernel logs of both bcache and btrfs were clean, the filesystem was clean,
> just the usual btrfs recovery messages after an unclean shutdown.
>
> I wonder if the SSD and/or the block layer in use may be part of the
> problem:
>
>   * if putting bcache on LVM, discards may not be handled well
>   * if putting bcache or the backing fs on LVM, barriers may not be handled
>     well (bcache relies on perfectly working barriers)
>   * does the SSD support powerloss protection? (IOW, use capacitors)
>   * latest firmware applied? read the changelogs of it?
>
> I'd try to first figure out these differences before looking further into
> debugging. I guess that most consumer-grade drives at least lack a few of
> the important features to use write-back mode, or use bcache at all.
>
> So, to start the list: My SSD is a Crucial MX100 128GB with discards enabled
> (for both bcache and btrfs), using plain raw devices (no LVM or MD
> involved). It supports TRIM (as my chipset does), and it supports powerloss-
> protection and maybe even some internal RAID-like data protection layer
> (whatever that is, it's in the papers).
>
> I'm not sure what a hard-reset technically means to the SSD but I guess it
> is handled as some sort of short powerloss. Reading through different SSD
> firmware update descriptions, I also see a lot of words around power-off and
> reset problems being fixed that could lead to data-loss otherwise. That
> could be pretty fatal to bcache as it considers its storage as always unclean
> (probably even in write-through mode). Having damaged data blocks out of
> expected write order (barriers!) could be pretty bad when bcache recovers
> from last shutdown and replays logs.

Samsung 840-EVO 256GB here, running 4.0-rc7 (was 3.18)

There are no known issues with TRIM on an 840-EVO, and no power loss or
anything of the sort occurred.  I was seeing excessive write
amplification on my SSD, so I enabled discard.  My machine promptly
started lagging, disk access eventually locked up, and after a reboot
I was confronted with:

[  276.558692] bcache: journal_read_bucket() 157: too big, 552 bytes, offset 2047
[  276.571448] bcache: prio_read() bad csum reading priorities
[  276.571528] bcache: prio_read() bad magic reading priorities
[  276.576807] bcache: error on 804d6906-fa80-40ac-9081-a71a4d595378: bad btree header at bucket 65638, block 0, 0 keys, disabling caching
[  276.577457] bcache: register_cache() registered cache device sda4
[  276.577632] bcache: cache_set_free() Cache set 804d6906-fa80-40ac-9081-a71a4d595378 unregistered

Attempting to check the backing store by forcing it to run without the
cache (echo 1 > bcache/running) gives:

[  687.912987] BTRFS (device bcache0): parent transid verify failed on 7567956930560 wanted 613690 found 613681
[  687.913192] BTRFS (device bcache0): parent transid verify failed on 7567956930560 wanted 613690 found 613681
[  687.913231] BTRFS: failed to read tree root on bcache0
[  687.936073] BTRFS: open_ctree failed
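
For anyone reading the BTRFS lines above: as far as I understand it, "wanted"
is the transaction generation the parent node expects and "found" is the
generation actually stored in the block that was read. The difference gives a
rough idea of how far behind the backing device is without the write-back
cache. A minimal sketch of that arithmetic in Python follows; the helper name
transid_gaps is made up for illustration, and the regex only assumes the
message format shown in the log above.

```python
import re

# One of the log lines from this report, copied verbatim.
LOG = (
    "[  687.912987] BTRFS (device bcache0): parent transid verify failed on "
    "7567956930560 wanted 613690 found 613681\n"
)

# \s+ between fields so the pattern also matches mail-wrapped log lines.
PATTERN = re.compile(
    r"parent transid verify failed on\s+(\d+)\s+wanted\s+(\d+)\s+found\s+(\d+)"
)

def transid_gaps(log_text):
    """Map each failing block address to (wanted - found) generations."""
    gaps = {}
    for match in PATTERN.finditer(log_text):
        bytenr, wanted, found = (int(g) for g in match.groups())
        gaps[bytenr] = wanted - found
    return gaps

print(transid_gaps(LOG))  # {7567956930560: 9}
```

Here the tree root block on the backing device is 9 generations older than
expected, which would be consistent with recent metadata writes having lived
only in the cache.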

The cache device is not going through LVM or anything of the sort, so
this is a direct failure of bcache.  Perhaps due to eraseblock
alignment and assumptions about sizes?  Either way, I've got a ton of
data to recover/restore now and I'm unhappy about it.
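
To put a number on the alignment guess, one thing that can be checked is
whether the cache partition starts on a multiple of the discard granularity
the kernel reports for the disk. This is only a rough sketch: the function
name is made up, the sysfs_root parameter exists purely so it can be pointed
at a test directory, and the reported granularity is not necessarily the real
erase block size of the SSD.

```python
from pathlib import Path

SECTOR_BYTES = 512  # the sysfs "start" attribute is in 512-byte sectors

def read_int(path):
    """Read a single integer from a sysfs-style attribute file."""
    return int(Path(path).read_text().strip())

def partition_start_misalignment(disk, part, sysfs_root="/sys"):
    """Bytes by which the partition start overshoots a discard boundary.

    Returns 0 if aligned, or None if the disk reports a discard
    granularity of 0 (i.e. no discard support).
    """
    block = Path(sysfs_root) / "block" / disk
    granularity = read_int(block / "queue" / "discard_granularity")
    if granularity == 0:
        return None
    start_bytes = read_int(block / part / "start") * SECTOR_BYTES
    return start_bytes % granularity
```

On this machine that would be called as
partition_start_misalignment("sda", "sda4"), assuming sda is the parent disk
of the sda4 cache device named in the log. A result of 0 would not rule out a
bcache bug, it would only rule out this particular misalignment.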



