Killed it again - enabled bcache discard, copied a few TB of data from
the backup to the drive, rebooted, and got a different error:

"bcache: bch_cached_dev_attach() Couldn't find uuid for <REDACTED> in set"

The exciting failure that required a reboot this time was an infinite
spin in bcache_writeback.

I'll give it another shot at narrowing down exactly what causes the
failure before I give up on bcache entirely.

On Sun, Apr 12, 2015 at 1:56 AM, Dan Merillat <dan.merillat@xxxxxxxxx> wrote:
> On Sat, Apr 11, 2015 at 4:09 PM, Kai Krakow <hurikhan77@xxxxxxxxx> wrote:
>
>> With this knowledge, I guess that bcache could probably detect its backing
>> device signature twice - once through the underlying raw device and once
>> through the md device. From your logs I'm not sure if they were complete
>
> It doesn't, the system is smarter than you think it is.
>
>> enough to see that case. But to be sure I'd modify the udev rules to exclude
>> the md parent devices from being run through probe-bcache. Otherwise all
>> sorts of strange things may happen (like one process accessing the backing
>> device through md, while bcache access it through the parent device -
>> probably even on different mirror stripes).
>
> This didn't occur, I copied all the lines pertaining to bcache but
> skipped the superfluous ones.
>
>> It's your setup, but personally I'd avoid MD for that reason and go with
>> lvm. MD is just not modern, neither appropriate for modern system setups. It
>> should really be just there for legacy setups and migration paths.
>
> Not related to bcache at all. Perhaps complain about MD on the
> appropriate list? I'm not seeing any evidence that MD had anything to
> do with this, especially since the issues with bcache are entirely
> confined to the direct SATA access to /dev/sda4.
>
> In that vein, I'm reading the on-disk format of bcache and seeing
> exactly what's still valid on my system. It looks like I've got
> 65,000 good buckets before the first bad one.
> My idea is to go
> through, look for valid data in the buckets and use a COW in
> user-mode-linux to write that data back to the (copy-on-write version
> of) the backing device. Basically, anything that passes checksum and
> is still 'dirty', force-write-it-out. Then see what the status of my
> backing-store is. If it works, do it outside UML to the real backing
> store.
>
> Are there any diagnostic tools outside the bcache-tools repo? Not much
> there other than show the superblock info. Otherwise I'll just finish
> writing it myself.
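
For illustration, the recovery idea quoted above (count the good buckets,
then force-write anything that passes checksum and is still dirty into a
copy-on-write view of the backing device) could be sketched roughly as
follows. This is only a sketch under loud assumptions: the bucket layout
here is a made-up toy format, NOT the real bcache on-disk format, and every
name in it (make_toy_bucket, parse_toy_bucket, count_good_prefix,
CowOverlay, replay_dirty) is hypothetical. A real tool would have to parse
the actual bcache structures and would run against device images rather
than in-memory bytes.

```python
import struct
import zlib
from typing import Iterable, List, Optional, Tuple

# HYPOTHETICAL toy layout, NOT the bcache on-disk format:
#   4-byte CRC32 of body | 8-byte backing offset | 1-byte dirty flag | payload
_META = struct.Struct("<QB")


def make_toy_bucket(offset: int, dirty: bool, payload: bytes) -> bytes:
    """Build a toy bucket (only used to exercise the sketch)."""
    body = _META.pack(offset, int(dirty)) + payload
    return struct.pack("<I", zlib.crc32(body)) + body


def parse_toy_bucket(raw: bytes) -> Optional[Tuple[int, bool, bytes]]:
    """Return (offset, dirty, payload), or None if the checksum fails."""
    if len(raw) < 4 + _META.size:
        return None
    (stored_crc,) = struct.unpack_from("<I", raw)
    body = raw[4:]
    if zlib.crc32(body) != stored_crc:
        return None
    offset, dirty = _META.unpack_from(body)
    return offset, bool(dirty), body[_META.size:]


def count_good_prefix(buckets: Iterable[bytes]) -> int:
    """Count buckets that pass checksum before the first bad one."""
    count = 0
    for raw in buckets:
        if parse_toy_bucket(raw) is None:
            break
        count += 1
    return count


class CowOverlay:
    """Copy-on-write view: writes are recorded, the base is never modified."""

    def __init__(self, base: bytes) -> None:
        self._base = base
        self._writes: List[Tuple[int, bytes]] = []  # in order; later wins

    def write(self, offset: int, data: bytes) -> None:
        if offset < 0 or offset + len(data) > len(self._base):
            raise ValueError("write outside backing device")
        self._writes.append((offset, bytes(data)))

    def read(self, offset: int, length: int) -> bytes:
        buf = bytearray(self._base[offset:offset + length])
        for w_off, data in self._writes:
            start = max(w_off, offset)
            end = min(w_off + len(data), offset + len(buf))
            if start < end:  # this write overlaps the requested range
                buf[start - offset:end - offset] = data[start - w_off:end - w_off]
        return bytes(buf)


def replay_dirty(buckets: Iterable[bytes], overlay: CowOverlay) -> int:
    """Write every checksum-valid, dirty bucket into the overlay."""
    written = 0
    for raw in buckets:
        parsed = parse_toy_bucket(raw)
        if parsed is None:
            continue  # fails checksum: skip it rather than trust it
        offset, dirty, payload = parsed
        if dirty:
            overlay.write(offset, payload)
            written += 1
    return written
```

Because the overlay never touches the base image, the result can be
inspected first (for example with a read-only fsck of the overlaid view)
and only applied to the real backing store if it looks sane.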