Kent Overstreet <kmo@xxxxxxxxxxxxx> writes:

> BTW - it sounds like you're ahead of me on how this is put together - could you
> point me at the userspace side of hibernate that you're using (those initramfs
> scripts, and in particular whatever device mapper does)? that'll help a lot.

Please see
https://github.com/NixOS/nixpkgs/blob/master/nixos/modules/system/boot/stage-1-init.sh
lines 128 onward.

I'm just using the in-kernel suspend/hibernate functionality (swsusp),
but the same probably applies to TuxOnIce and other solutions as well.

As far as I understand, the events leading up to hibernation are very
similar to suspend. The kernel notifies processes and kernel threads
that they will be frozen. Then, when everything has prepared for
suspension, instead of just suspending the system, the kernel writes
the full contents of system RAM to a swap device. I'm pretty sure
that's all still OK and things are in a consistent state.

However, when resuming the system, some basic initialization is
performed in the initrd. At the very least, the SCSI/SATA controller
modules and whatever else is required to find the hibernated RAM image
have to be loaded, but most distributions include more than that and
use udev to find hardware and load the appropriate modules.

This is where things might get nasty. On NixOS, we normally initialize
bcache using udev rules. Modern versions of udev do a quick scan of
block devices as they appear, to determine their labels/types. So while
waiting for disks, USB devices and so on to appear, my bcache
partitions get found and activated. I'm pretty sure bcache takes over
from there and does some bookkeeping, flushes dirty buckets, or
whatever. Even without writeback, things might change on disk: udev and
tools like vgscan (lvm) and "btrfs scan" might probe some magic sectors
inside the newly activated bcache device. If those aren't in the cache,
they will be put there, once again changing the on-disk state.

Then finally (line 190) the kernel gets instructed to check whether the
swap device contains a hibernated RAM image and, if so, to restore it.
For everything running "inside the RAM image", it's just like waking up
from a normal suspend.

From this explanation, it should be clear that it is vital that no
on-disk state is changed in the brief period during which the initrd is
setting up the system, or bcache's in-memory state (inside the resumed
RAM image) will no longer match what is on disk, probably leading to
disasters. Either that, or bcache should assume nothing on resume and
reassemble its entire in-memory state from disk.

The temporary solution I found was to not include the bcache udev rules
in the initrd, so the bcache devices do not get found and activated
before resume. Then, for normal booting (I have my root on bcache), I
manually load and activate bcache _after_ seeing there is no resume
image. However, this solution is ugly, because I need to repeat all
other initialization steps (lvm/btrfs) afterwards.

Hope this helps,
Mathijs
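
For reference, the ordering of the workaround described above could be sketched roughly as follows. This is only a minimal illustration, not the actual NixOS script: the function names are made up, the device arguments are placeholders, and it assumes the usual sysfs interfaces (writing a major:minor pair to /sys/power/resume, and writing a device path to /sys/fs/bcache/register). The two paths are kept in variables so the ordering can be dry-run against ordinary files.

```shell
#!/bin/sh
# Sketch only: attempt the resume first, touch bcache afterwards.
# Both paths are overridable so this can be tried outside an initrd.
RESUME_CTL="${RESUME_CTL:-/sys/power/resume}"
BCACHE_REGISTER="${BCACHE_REGISTER:-/sys/fs/bcache/register}"

# Ask the kernel to look for a hibernation image on the given swap
# device ("major:minor"). If an image is found, the kernel restores it
# and this function never returns.
try_resume() {
    { echo "$1" > "$RESUME_CTL"; } 2>/dev/null || true
}

# Register each backing/cache device with bcache by hand, since the
# udev rules are deliberately left out of the initrd.
register_bcache() {
    for dev in "$@"; do
        echo "$dev" > "$BCACHE_REGISTER"
    done
}

# Hypothetical helper tying the two steps together in the safe order.
boot_sequence() {
    resume_dev="$1"
    shift
    try_resume "$resume_dev"
    # Only reached when no image was restored, so activating bcache
    # here cannot invalidate a hibernated in-memory state.
    register_bcache "$@"
    # lvm/btrfs scanning would have to be repeated after this point.
}

# Example call (placeholder devices):
#   boot_sequence "8:3" /dev/sda1 /dev/sdb1
```

The point of the sketch is simply that nothing writes to the bcache devices until the resume attempt has come back empty-handed.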