On Wed, Jul 24, 2019 at 9:56 AM Darrick J. Wong <darrick.wong@xxxxxxxxxx> wrote: > > On Wed, Jul 24, 2019 at 12:07:49PM -0400, Theodore Y. Ts'o wrote: > > On Tue, Jul 23, 2019 at 11:12:31PM -0700, Darrick J. Wong wrote: > > > On Tue, Jul 23, 2019 at 11:03:54PM -0700, harshad shirwadkar wrote: > > > > Before I respond to your questions, I would like to explain how fast > > > > commits differ from ijournal in a few key aspects (I will make sure to > > > > explain it in detail in patch 00/11 and documentation): > > > > > > Please do; I hadn't realized there were also journal ondisk format > > > changes, and these must be recorded in the ext4 disk format > > > documentation. > > > > Actually, the changes are almost entirely in the on-disk journal > > layer. > > I know. > > Hmm, just as a reminder -- the ext4 disk format documentation > includes the jbd2 disk format documentation. > > > The addition of the feature flag is really a UI issue, and > > worth some discussion. > > > > One of the goals was to make it easy to allow kernels which didn't > > understand fast commit to be able to mount a file system which had > > been cleanly unmounted --- but of course, if the file system needs > > recovery, and fast commits are in the journal, we can't allow a fast > > commit oblivious kernel (or e2fsck) from trying to replay the journal. > > BTW, are there patches to fix e2fsck to replay the factcommit journal? > > > One way to do this would be with a mount option, but that's a bit ugly > > --- and a mount option in /etc/fstab will cause a failure if a kernel > > which doesn't understand that mount option is booted. > > > > So the basic idea is to have a compat feature which means, "please use > > fast commits if present", and then when the file system is mounted on > > a fast-commit capable kernel, the incompat feature meaning "we're > > using the fast commit feature". (This is same design pattern used > > with the HAS_JOURNAL compat feature and the NEEDS_RECOVERY incompat > > feature.) > > > > The next question is whether to use the compat and incompat feature > > flags in the jbd2 superblock, or ext4-specific compat flags. For the > > incompat flag, there's no reason not to use the journal incompat flag. > > But for the compat flag, we have better infrastructure for setting and > > clearing ext4-level compat feature flags. Aside from that, though, > > there's no reason why we couldn't use the s_feature_compat field in > > the journal superblock --- in which case, *all* of the on-diks format > > changes would purely be on the jbd2 side of the ledger. > > Probably better to use the journal compat flag so that the other jbd2 > users can take advantage of it ... on the other hand, the only other > user (AFAIK) is ocfs2 and HAH. > > > > Every feature flag you add doubles the size of the testing matrix. > > > If I were you I'd only want to test the (fastcommit) and (!fastcommit) > > > scenarios. > > > > Sure, absolutely. On the other hand, as the saying goes, "there comes > > a time in any project where it's time to shoot the engineers and put > > the d*mned thing into production". One of the reasons why we're super > > interested in this feature is to claw back the performance hit of > > fde872682e17 ("ext4: force inode writes when nfsd calls > > commit_metadata"). I fully expect that this feature is going to make > > big difference to a number of customer workloads, so there is some > > urgency to getting this feature into production. > > > > On the flip side, if we leave some performance wins on the table, it's > > absolutely true that it makes it harder to add those optimizations > > later, and it increases the testing load, not to mention the forwards > > and backwards compatibility issues. It's an engineering trade-off. > > <nod> I just remember hearing you complain about the size of the ext4 > testing matrix in the past and figured you would't go for adding > fastcommit in small pieces each with new feature bits. > > (I guess you could have a fastcommit_version field that increments every > time you add a new fastcommit journal item to constrain the combinatoric > explosion...) I agree, I was going to suggest the same. We would probably need to add this field in all individual fast commit blocks, since we don't have a fast commit superblock equivalent .. and changing jbd2 superblock is probably too much to ask for I guess. > > --D > > > > > - Ted