So this is my first reboot in anger of my new writearound bcache (not my first reboot, but my first reboot after letting the cache populate itself: it's still 85% empty and has never needed to GC). As usual, I get a timeout error from bcache on restart, right before rebooting, but then, at boot...

# Register all bcaches.
if [ -f /sys/fs/bcache/register_quiet ]; then
    for name in /dev/sd*[0-9]* /dev/md/*; do
        echo $name > /sys/fs/bcache/register_quiet 2>&1
    done

    # New devices registered: create them, after a short delay
    # to let the registration happen.
    sleep 1
    /sbin/mdev -s
fi

... does *this* (including the messages showing that the md array it's caching is happy):

[ 11.281907] md: md125 stopped.
[ 11.294948] md/raid:md125: device sda3 operational as raid disk 0
[ 11.305620] md/raid:md125: device sdf3 operational as raid disk 4
[ 11.315899] md/raid:md125: device sdd3 operational as raid disk 3
[ 11.325770] md/raid:md125: device sdc3 operational as raid disk 2
[ 11.335245] md/raid:md125: device sdb3 operational as raid disk 1
[ 11.344688] md/raid:md125: raid level 6 active with 5 out of 5 devices, algorithm 2
[ 11.353810] md125: detected capacity change from 0 to 15761089757184
[ 11.468956] bcache: prio_read() bad csum reading priorities
[ 11.478010] bcache: prio_read() bad magic reading priorities
[ 11.497911] bcache: error on 314dcdd2-9869-4110-99cc-9cd3a861afa6:
[ 11.497914] bad checksum at bucket 28262, block 0, 36185 keys
[ 11.507021] , disabling caching
[ 11.529823] bcache: register_cache() registered cache device sde2
[ 11.539054] bcache: cache_set_free() Cache set 314dcdd2-9869-4110-99cc-9cd3a861afa6 unregistered
[ 11.558596] bcache: register_bdev() registered backing device md125

This then leaves me without a rootfs until I thrash around and figure out how to detach the cache and get back to a working backing device.
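For anyone hitting the same thing, the recovery amounts to something like the sketch below. It is not a record of the exact commands used here: it assumes the `running` sysfs knob that the kernel's bcache documentation describes for starting a backing device whose cache set is missing, and the `force_run_backing` helper name is made up for illustration. For the array in the log above the directory would presumably be /sys/block/md125/bcache.

```shell
#!/bin/sh
# Sketch: start a bcache backing device without its cache set.
# force_run_backing is a hypothetical helper; its argument is the backing
# device's bcache sysfs directory (assumed: /sys/block/md125/bcache here).
force_run_backing() {
    dir=$1
    if [ ! -d "$dir" ]; then
        echo "no bcache directory at $dir" >&2
        return 1
    fi
    # Writing 1 to 'running' asks bcache to bring the device up
    # even though the cache set it was attached to is not registered.
    echo 1 > "$dir/running"
}

# Example (needs root, not run automatically):
#   force_run_backing /sys/block/md125/bcache
```

In writearound mode the cache should hold no dirty data, so running the backing device bare ought to be safe; that would not be a safe assumption for a writeback cache.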
The machine has ECC RAM, and the SSD is one of the Intel DC ones with supercapacitors etc, so I don't think we can blame either of those parts. This is software killing itself with no assistance needed from hardware, I think.

This is writearound, without most of the horrible complexities of writeback cache invalidation: all the cache has to note is that a given block has been written and should be invalidated on next read. So I don't think we can blame that machinery, either. This is just the bcache failing to do its job writing on shutdown, presumably because it spontaneously times out instead. (Why?!)

The machine has heaps of RAM (128GiB) so you can't rely on memory pressure writing stuff out -- much of the time, there is none. I suspect that's the problem here... if it's doing any writing at all, two seconds is not remotely long enough to let a writeout finish -- at the rated 480MiB/s (yeah right), my SSD might take up to *250 seconds* to finish its writeout.

Several points spring to mind:

 - Why does it time out on reboot, rather than, say, trying to write enough out that it does not crash on restart? Why does it only wait for a fixed, very short, timespan?

 - Bad checksums are all very well, but in writearound mode it should come up *anyway*, sans cache, since the cache is devoid of dirty data.

 - Bad checksums are all very well, but in writearound mode it should discard the minimum possible -- in this case, one bucket -- and keep going with 99% of the cache intact.

I still have the cache in question sitting there on the block device if anyone wants a look at it. (Some of what it contains is an ordinary xfs fs: the rest is cryptsetup stuff.)

Anyone know what could be wrong here, and how I can prevent it happening again? One presumes I can get an empty cache back by just wipefs'ing the cache device and re-running make-bcache on it, but I'd rather not do that until I know if anyone wants to take a look at it.
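For concreteness, the reset in question would look roughly like this. It is an untested sketch and it is destructive, so it defaults to a dry run that only prints the commands. The `reset_cache_device` helper and the `DRY_RUN` variable are made up for illustration, and /dev/sde2 is simply the cache device named in the log above.

```shell
#!/bin/sh
# Sketch: wipe a bcache cache device and format it afresh. DESTRUCTIVE.
# reset_cache_device and DRY_RUN are hypothetical names for this example.
# With DRY_RUN unset or set to 1 the commands are printed, not executed.
reset_cache_device() {
    dev=$1
    if [ "${DRY_RUN:-1}" = 1 ]; then
        prefix=echo
    else
        prefix=
    fi
    # wipefs -a erases all known signatures (including the bcache superblock);
    # make-bcache -C then formats the device as a new, empty cache.
    $prefix wipefs -a "$dev" &&
    $prefix make-bcache -C "$dev"
}

# Dry run against the cache device from the log:
reset_cache_device /dev/sde2
```

The new cache set gets a new UUID, so the backing device would presumably need attaching to it afterwards via its `attach` sysfs file.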