On Tue, Apr 02, 2024 at 03:42:40PM +0200, Jan Kara wrote:
> On Tue 02-04-24 17:09:51, Ye Bin wrote:
> > We encountered a problem that the file system could not be mounted in
> > the power-off scenario. The analysis of the file system mirror shows that
> > only part of the data is written to the last commit block.
> > To solve above issue, if commit block checksum is incorrect, check the next
> > block if has valid magic and transaction ID. If next block hasn't valid
> > magic or transaction ID then just drop the last transaction ignore checksum
> > error. Theoretically, the transaction ID maybe occur loopback, which may cause
> > the mounting failure.
> >
> > Signed-off-by: Ye Bin <yebin10@xxxxxxxxxx>
>
> So this is curious. The commit block data is fully within one sector and
> the expectation of the journaling is that either full sector or nothing is
> written. So what kind of storage were you using that it breaks these
> expectations?

I suppose if the physical sector size is 512 bytes and the file system
block is 4k, it's possible that on a crash only part of the 4k commit
block gets written.

In *practice* though, this is super rare. That's because on many
modern HDDs, the physical sector size is 4k (because the ECC overhead
is much lower), even if the logical sector size is 512 bytes (for
Windows 98 compatibility). And even on HDDs where the physical sector
size is really 512 bytes, the way the sectors are laid out in a
serpentine fashion makes it *highly* likely that a 4k write won't get
torn.

And while a torn write is *possible*, it's also possible that there
was some kind of I/O transfer error, such as bit flips which break the
checksum on the commit block but also trash the tid of the subsequent
block. In that case your patch gets tricked into thinking that this is
the partial last commit when in fact it's not the last commit, causing
the journal replay to abort early.
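
To make that failure mode concrete, here is a rough sketch of the
decision the patch description implies. This is not the actual patch
and not real jbd2 code: the struct, the enum, and sketch_check_commit()
are made-up names for illustration, and the sketch assumes the caller
has already read both blocks and verified the checksum. Only the magic
value is the real jbd2 one.

```c
#include <stdbool.h>
#include <stdint.h>

/* The jbd2 journal magic number; everything else here is hypothetical. */
#define SKETCH_JOURNAL_MAGIC 0xC03B3998U

/* Hypothetical stand-in for the few fields the heuristic looks at. */
struct sketch_block {
	uint32_t magic;    /* header magic as read from the block */
	uint32_t tid;      /* transaction ID as read from the block */
	bool     csum_ok;  /* did the block's checksum verify? */
};

enum sketch_verdict {
	SKETCH_REPLAY,     /* commit block is good, replay the transaction */
	SKETCH_DROP_LAST,  /* assume a torn final commit and stop replay */
	SKETCH_FAIL        /* report the checksum error */
};

/*
 * If the commit block's checksum is bad, look at the next block.  A
 * valid magic plus the expected tid means the journal continues, so the
 * bad checksum is real corruption.  Anything else is taken to mean that
 * this was the last, partially written commit.
 */
static enum sketch_verdict
sketch_check_commit(const struct sketch_block *commit,
		    const struct sketch_block *next,
		    uint32_t expected_next_tid)
{
	if (commit->csum_ok)
		return SKETCH_REPLAY;
	if (next->magic == SKETCH_JOURNAL_MAGIC &&
	    next->tid == expected_next_tid)
		return SKETCH_FAIL;
	return SKETCH_DROP_LAST;
}
```

With a bad commit checksum and a trashed tid in the next block, the
sketch returns SKETCH_DROP_LAST even though later transactions exist,
so replay stops early on what is really a corrupted journal.
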
If that's the case, it's much safer to force fsck to be run to detect
any inconsistency that might result.

In general, I strongly recommend that fsck be run on the file system
before you try to mount it. Yeah, historically the root file system
gets mounted read-only, and then fsck gets run on it, and if
necessary, fsck will fix it up and then force a reboot. Ye, I'm
assuming that this is what you're doing, and so that's why you really
don't want the mount to fail?

If so, the better way to address this is to use an initramfs which can
run fsck on the real root file system, then mount it, then use
pivot_root and exec the real init program. That way, even if the
journal is corrupted in that way, fsck will attempt to replay the
journal, fail, and you can have fsck do a forced fsck to fix up the
file system.

- Ted