On Fri, Jun 01, 2018 at 12:47:05PM +0200, Maarten van Malland wrote:
> I have a not so common setup that IMHO triggers a bug in the Ext4
> journal code. I have the following setup:
>
> - A mdadm RAID10 device with Bcache backing and LVM on top. This
>   should actually not matter at all, but perhaps still worth
>   mentioning.
> - The Ext4 volume resides on a LVM VG, with an external journal on a
>   NVMe drive.
> - I use LVM snapshotting for that volume
>
> Now, when I make the snapshot I do the following:
>
> lvremove /dev/bcache/root-snap
> lvcreate -c 512 -I 512 -n root-snap -L 250G -s /dev/bcache/root
> tune2fs -O ^has_journal /dev/bcache/root-snap (to get rid of the external journal)
> tune2fs -O has_journal /dev/bcache/root-snap (to create a new internal journal)
>
> When finished, I can mount /dev/bcache/root-snap just fine, with the
> internal journal working. However, when I reboot it's a different
> issue. For whatever reason the kernel still sees both
> /dev/bcache/root and /dev/bcache/root-snap with an external journal!

I suspect that's not what is going on. The problem is that external
journals predate snapshot support, and external journals aren't very
well supported in the first place, because so few people use them.

The other thing to understand about external journals is that both the
external journal and the file system each have a UUID, and the file
system superblock, in addition to its own UUID, has the UUID of the
external journal which it is using. And the external journal, in
addition to its own UUID, has a list of UUIDs of the file systems that
are using the external journal. (There is partial support to allow
multiple file systems to use the same journal, which was never
completed.)

So when you created the snapshot:

    lvremove /dev/bcache/root-snap
    lvcreate -c 512 -I 512 -n root-snap -L 250G -s /dev/bcache/root

This created a new block device which had the same file system UUID as
the original file system.
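
If you want to see the duplicated UUID for yourself, a rough sketch
along these lines may help. It assumes that "dumpe2fs -h" prints
"Filesystem UUID:" and "Journal UUID:" lines for the device, and the
helper names (fs_uuid, journal_uuid, same_fs_uuid) are made up for
illustration; they are not part of e2fsprogs.

```shell
#!/bin/sh
# Sketch only: pull UUID fields out of `dumpe2fs -h` style output.
# fs_uuid, journal_uuid and same_fs_uuid are hypothetical helper names.

# Print the value of the "Filesystem UUID:" line read from stdin.
fs_uuid() {
    awk -F': *' '/^Filesystem UUID:/ { print $2; exit }'
}

# Print the value of the "Journal UUID:" line read from stdin
# (assumed to be present when the superblock points at an external journal).
journal_uuid() {
    awk -F': *' '/^Journal UUID:/ { print $2; exit }'
}

# Return 0 if both superblock dumps report the same file system UUID.
same_fs_uuid() {
    a=$(printf '%s\n' "$1" | fs_uuid)
    b=$(printf '%s\n' "$2" | fs_uuid)
    [ -n "$a" ] && [ "$a" = "$b" ]
}

# Example use on the real devices (needs root), left commented out:
#   origin=$(dumpe2fs -h /dev/bcache/root 2>/dev/null)
#   snap=$(dumpe2fs -h /dev/bcache/root-snap 2>/dev/null)
#   same_fs_uuid "$origin" "$snap" && echo "duplicate file system UUID"
```

Right after the lvcreate above, I would expect the commented-out check
to report a duplicate, since the snapshot is a block-for-block copy of
the origin's superblock.
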
When you then attempted to remove the external journal:

    tune2fs -O ^has_journal /dev/bcache/root-snap

... this cleared the external journal's UUID from
/dev/bcache/root-snap. However, this *also* removed the UUID shared by
/dev/bcache/root and /dev/bcache/root-snap from the external journal.

This was fine while /dev/bcache/root remained mounted. But then when
you next tried to remount /dev/bcache/root, the mount would have
failed, because while /dev/bcache/root has a pointer (via a UUID) to
the external journal, the external journal no longer has a
back-pointer (via UUID) to /dev/bcache/root.

You didn't say what the script in initrd was that fixed it, but I'm
guessing it was something like:

    tune2fs -O ^has_journal /dev/bcache/root

Which would have resulted in the warning message:

    tune2fs 1.44.2 (14-May-2018)
    Filesystem's UUID not found on journal device.    <======
    Journal removed

Followed by something like:

    tune2fs -J device=/dev/bcache/journal /dev/bcache/root

The fundamental problem is that there is a deep assumption that file
system UUIDs are unique. This is needed for mounting-by-uuid to work,
for example. Creating snapshots which aren't ephemeral breaks this
assumption, so it's not just external journals which have this
problem. If you have "UUID=xxxx" in your /etc/fstab, it's going to
cause confusion as well.

So the quick workaround for your problem is to use this instead of
"tune2fs -O ^has_journal /dev/bcache/root-snap":

    debugfs -w /dev/bcache/root-snap << EOF
    features ^has_journal
    set_super_value journal_uuid null
    set_super_value journal_dev 0
    quit
    EOF

Regards,

					- Ted