On Fri, Dec 10, 2021 at 04:22:20PM +0100, Lukas Czerner wrote: > > There are couple of places in ext4 where we change superblock using > journal; set features, generate s_encrypt_pw_salt, modify s_last_orphan, > s_last_mounted and there is also ext4_update_super() in > flush_stashed_error_work(). Also all the wild things done in resize.c. > > I think we should consolidate all or most of those under a single helper > and I was thiking about how to go about it cleanly. There are some changes which I think need to be restricted to only the primary superblock. This includes updates to s_last_orphan, s_last_mounted, and flushing error information by flush_stashed_error_work(). The last is because if we've found some kind of file system corruption, the problem might have been in a corrupted superblock. So we don't necessary want to mess with the backup superblocks, since that might propagate the problem to all of the backup superblocks. And s_last_orphan, s_last_mounted, are updated all the time, and they should only be updating the primary superblock because (a) the performance impacts if we need to update multiple sueprblocks, and (b) one of the ways we can avoid backup superblocks from being corrupted is to avoid updating them. So we should only be updating backup superblocks when we *have* to, because we're updating something fundamental about the file system metadata --- such as the size of the file system, when we're doing an online resize --- or we're changing the UUID, or the Label, etc. BTW, updating features is something that we generally avoid in new kernel code; we've done it in the past, but it's better for the system adminsirator to explicitly needs to enable a feature, as opposed to having the kernel updating the feature when we create a huge file, for example. > We could play games with modifying s_es directly, which just points > into s_sbh. And rely on the fact that it's read only once so we > maybe should be able to modify it and then do the journal dance > afterwards? I don't know if that's even allowed, but it sounds weird > to me. Well, that's how we do things today when we update the primary superblock, and I think that's the right thing to do thing. We need to modify s_sbh->b_data in any case so that the journalling works correctly in any case. For the backup superblocks, for the ones which we are updating as part of the transaction, we need to do it via a their bh using the jbd2 updating protocol. What I have in mind is to pass into the "update superblock" function a callback function which actually modifies the appropriate primary or backup superblock. So there would be a callback function that updates s_uuid, or s_volume_name, etc. So the updating_superblock function would do the following: * get a handle that updates 3 blocks (the primary and two backups) * call jbd2_journal_get_write_access() on s_bh * call callback function to update primary superblock (s_bh) * update the superblock checksum * call ext4_handle_dirty_metadata on s_bh * For the first two backup superblocks - get a bh for the backup superblock - if the bh is not buffer_verified, check the checksum on the backup superblock, and if it is not valid, call ext4_error() indicating that the backup superblock is invalid, and skip updating it. - get write access on the bh for the backup superblock - call the callback funnction to update the backup superblock - call ext4_handle_dirty_metadata * call jbd2_journal_stop(handle) Does this make sense to you? > One disadvantage might be that on-disk modification from userspace on > mounted file systems will not work anymore, since it will always be > overwriten by the in-kernel in-memory representation. Well, eventually I'd like to deprecate, and perhaps outright disallow on-disk modification from userspace. But we need to create ioctls that can do everything tune2fs can validly do on a mounted file system, and then wait to make sure the newer version e2fsprogs has been installed everywhere that where a user might try to install an updated kernel before we can change the kernel to disallow direct updates to the ext4 superblock. Given that users may be installing random upstream kernel on a RHEL or SLES system, and they may be doing that without updating e2fsprogs first, anything which breaks current versions of e2fsprogs is going to cause a huge amount of pain when a platinum customer calls either Red Hat or Google Cloud's support personnel, and you and I won't want to get dragged into a support call with an irate customer with a huge cloud or RHEL spend and where the customer relationship exec is trying very hard to keep the customer happy.... So out of sheer self defense, it's going to be a while before we can deprecate direct modification of the superblock by programs like tune2fs. As in, probably 8 to 10 years. :-/ - Ted