Re: [PATCH 01/11] ext4: add handling for extended mount options

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Wed, Jul 24, 2019 at 12:07:49PM -0400, Theodore Y. Ts'o wrote:
> On Tue, Jul 23, 2019 at 11:12:31PM -0700, Darrick J. Wong wrote:
> > On Tue, Jul 23, 2019 at 11:03:54PM -0700, harshad shirwadkar wrote:
> > > Before I respond to your questions, I would like to explain how fast
> > > commits differ from ijournal in a few key aspects (I will make sure to
> > > explain it in detail in patch 00/11 and documentation):
> > 
> > Please do; I hadn't realized there were also journal ondisk format
> > changes, and these must be recorded in the ext4 disk format
> > documentation.
> 
> Actually, the changes are almost entirely in the on-disk journal
> layer.

I know.

Hmm, just as a reminder -- the ext4 disk format documentation
includes the jbd2 disk format documentation.

> The addition of the feature flag is really a UI issue, and
> worth some discussion.
> 
> One of the goals was to make it easy to allow kernels which didn't
> understand fast commit to be able to mount a file system which had
> been cleanly unmounted --- but of course, if the file system needs
> recovery, and fast commits are in the journal, we can't allow a fast
> commit oblivious kernel (or e2fsck) from trying to replay the journal.

BTW, are there patches to fix e2fsck to replay the factcommit journal?

> One way to do this would be with a mount option, but that's a bit ugly
> --- and a mount option in /etc/fstab will cause a failure if a kernel
> which doesn't understand that mount option is booted.
> 
> So the basic idea is to have a compat feature which means, "please use
> fast commits if present", and then when the file system is mounted on
> a fast-commit capable kernel, the incompat feature meaning "we're
> using the fast commit feature".  (This is same design pattern used
> with the HAS_JOURNAL compat feature and the NEEDS_RECOVERY incompat
> feature.)
> 
> The next question is whether to use the compat and incompat feature
> flags in the jbd2 superblock, or ext4-specific compat flags.  For the
> incompat flag, there's no reason not to use the journal incompat flag.
> But for the compat flag, we have better infrastructure for setting and
> clearing ext4-level compat feature flags.  Aside from that, though,
> there's no reason why we couldn't use the s_feature_compat field in
> the journal superblock --- in which case, *all* of the on-diks format
> changes would purely be on the jbd2 side of the ledger.

Probably better to use the journal compat flag so that the other jbd2
users can take advantage of it ... on the other hand, the only other
user (AFAIK) is ocfs2 and HAH.

> > Every feature flag you add doubles the size of the testing matrix.
> > If I were you I'd only want to test the (fastcommit) and (!fastcommit)
> > scenarios.
> 
> Sure, absolutely.  On the other hand, as the saying goes, "there comes
> a time in any project where it's time to shoot the engineers and put
> the d*mned thing into production".  One of the reasons why we're super
> interested in this feature is to claw back the performance hit of
> fde872682e17 ("ext4: force inode writes when nfsd calls
> commit_metadata").  I fully expect that this feature is going to make
> big difference to a number of customer workloads, so there is some
> urgency to getting this feature into production.
> 
> On the flip side, if we leave some performance wins on the table, it's
> absolutely true that it makes it harder to add those optimizations
> later, and it increases the testing load, not to mention the forwards
> and backwards compatibility issues.  It's an engineering trade-off.

<nod> I just remember hearing you complain about the size of the ext4
testing matrix in the past and figured you would't go for adding
fastcommit in small pieces each with new feature bits.

(I guess you could have a fastcommit_version field that increments every
time you add a new fastcommit journal item to constrain the combinatoric
explosion...)

--D

> 
>     	      		    	     	     - Ted



[Index of Archives]     [Reiser Filesystem Development]     [Ceph FS]     [Kernel Newbies]     [Security]     [Netfilter]     [Bugtraq]     [Linux FS]     [Yosemite National Park]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Samba]     [Device Mapper]     [Linux Media]

  Powered by Linux