Thanks a lot for your very elaborate and informative reply :). You were
spot-on with everything actually (including the initrd script that I
made) and the extra information helped me to understand what really is
going on here.

I've tried your suggestion with the debugfs command instead of using
tune2fs and that worked beautifully. I hope that this info will help
others as well, as I couldn't find anything on google about this.

Thanks again.

On Mon, Jun 4, 2018 at 12:14 AM, Theodore Y. Ts'o <tytso@xxxxxxx> wrote:
> On Fri, Jun 01, 2018 at 12:47:05PM +0200, Maarten van Malland wrote:
>> I have a not so common setup that IMHO triggers a bug in the Ext4
>> journal code. I have the following setup:
>>
>> - A mdadm RAID10 device with Bcache backing and LVM on top. This
>>   should actually not matter at all, but perhaps still worth
>>   mentioning.
>> - The Ext4 volume resides on a LVM VG, with an external journal on a
>>   NVMe drive.
>> - I use LVM snapshotting for that volume
>>
>> Now, when I make the snapshot I do the following:
>>
>> lvremove /dev/bcache/root-snap
>> lvcreate -c 512 -I 512 -n root-snap -L 250G -s /dev/bcache/root
>> tune2fs -O ^has_journal /dev/bcache/root-snap (to get rid of the external journal)
>> tune2fs -O has_journal /dev/bcache/root-snap (to create a new internal journal)
>>
>> When finished, I can mount /dev/bcache/root-snap just fine, with the
>> internal journal working. However, when I reboot it's a different
>> issue. For whatever reason the kernel still sees both
>> /dev/bcache/root and /dev/bcache/root-snap with an external journal!
>
> I suspect that's not what is going on. The problem is that external
> journals predate snapshot support, and external journals aren't very
> well supported in the first place, because so few people use them.
>
> The other thing to understand about external journals is that both the
> external journal and the file system each have a UUID, and the file
> system superblock, in addition to its UUID, has the UUID for the
> external journal which it is using. And the external journal, in
> addition to its UUID, has a list of UUIDs for the file systems that
> are using the external journal. (There is partial support to allow
> multiple file systems to use the same journal, which was never
> completed.)
>
> So when you created the snapshot:
>
>     lvremove /dev/bcache/root-snap
>     lvcreate -c 512 -I 512 -n root-snap -L 250G -s /dev/bcache/root
>
> This created a new block device which had the same file system UUID as
> the original file system. When you then attempted to remove the
> external journal:
>
>     tune2fs -O ^has_journal /dev/bcache/root-snap
>
> ... this cleared the external journal's UUID from
> /dev/bcache/root-snap. However, this *also* removed the UUID of
> /dev/bcache/root and /dev/bcache/root-snap from the external journal.
>
> This was fine while /dev/bcache/root remained mounted. But then when
> you next tried to remount /dev/bcache/root, the mount would have
> failed, because while /dev/bcache/root has a pointer (via a UUID) to
> the external journal, the external journal no longer has a
> back-pointer (via UUID) to /dev/bcache/root.
>
> You didn't say what the script in initrd was that fixed it, but I'm
> guessing it was something like:
>
>     tune2fs -O ^has_journal /dev/bcache/root
>
> Which would have resulted in the warning message:
>
>     tune2fs 1.44.2 (14-May-2018)
>     Filesystem's UUID not found on journal device.   <======
>     Journal removed
>
> Followed by something like:
>
>     tune2fs -J device=/dev/bcache/journal /dev/bcache/root
>
>
> The fundamental problem is that there is a deep assumption that file
> system UUIDs are unique. This is needed for mounting-by-uuid to
> work, for example.
> Creating snapshots which aren't ephemeral breaks this assumption, so
> it's not just external journals which have this problem. If you have
> "UUID=xxxx" in your /etc/fstab, it's going to cause confusion as well.
>
> So the quick workaround for your problem is to use this instead of
> "tune2fs -O ^has_journal /dev/bcache/root-snap":
>
>     debugfs -w /dev/bcache/root-snap << EOF
>     features ^has_journal
>     set_super_value journal_uuid null
>     set_super_value journal_dev 0
>     quit
>     EOF
>
> Regards,
>
> - Ted
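
For anyone who finds this thread later, here is a rough sketch of the
snapshot steps with the debugfs workaround substituted for the first
tune2fs call. The device paths and lvcreate options are the ones from
this thread. The DRY_RUN switch and the function names (run,
detach_journal, make_snapshot) are made up for illustration; with
DRY_RUN=1 (the default) the commands are only printed, not executed.
Treat it as a starting point under those assumptions, not a tested
procedure.

```shell
#!/bin/sh
# Sketch only: recreate the snapshot and detach it from the external
# journal without modifying the journal device itself.
# ORIGIN/SNAP are the paths from this thread; DRY_RUN is a hypothetical
# safety switch (1 = print commands only, anything else = execute).
ORIGIN=${ORIGIN:-/dev/bcache/root}
SNAP=${SNAP:-/dev/bcache/root-snap}
DRY_RUN=${DRY_RUN:-1}

run() {
    # Print the command in dry-run mode, otherwise execute it.
    if [ "$DRY_RUN" = 1 ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

detach_journal() {
    # Clear has_journal and the journal pointers in the snapshot's
    # superblock only, so the UUID list on the external journal
    # (shared with the origin volume) is left alone.
    if [ "$DRY_RUN" = 1 ]; then
        echo "+ debugfs -w $1 (features ^has_journal; journal_uuid null; journal_dev 0)"
    else
        debugfs -w "$1" <<EOF
features ^has_journal
set_super_value journal_uuid null
set_super_value journal_dev 0
quit
EOF
    fi
}

make_snapshot() {
    run lvremove "$SNAP"
    run lvcreate -c 512 -I 512 -n "${SNAP##*/}" -L 250G -s "$ORIGIN"
    detach_journal "$SNAP"
    # Create a new internal journal on the snapshot.
    run tune2fs -O has_journal "$SNAP"
}

make_snapshot
```

Run it once as-is to review the printed commands, then with DRY_RUN=0
(as root) to actually execute them.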