Re: [PATCH v4 2/4] fsck: force core.checksumindex=1

Jeff Hostetler <git@xxxxxxxxxxxxxxxxx> · Tue, 4 Apr 2017 09:13:33 -0400






On 4/4/2017 4:23 AM, Ævar Arnfjörð Bjarmason wrote:
On Tue, Apr 4, 2017 at 4:29 AM, Jeff King <peff@xxxxxxxx> wrote:
On Mon, Apr 03, 2017 at 10:31:03PM +0200, Ævar Arnfjörð Bjarmason wrote:

On Mon, Apr 3, 2017 at 8:53 PM,  <git@xxxxxxxxxxxxxxxxx> wrote:
Teach fsck to override core.checksumindex and always verify
the index checksum when reading the index.

Sorry to only chime in about this at v4.

I think this patch & the documentation you added for
core.checksumindex in 1/4 would be much less confusing if you made
this a on-by-default command-line option like e.g. "full".

With this patch nothing amends the documentation to indicate that the
core.checksumindex is magically overridden when fsck runs, I think
something like this (needs amending to integrate) on top would make
this clearer:

I think that is the wrong direction to reduce confusion. We don't need
more options to twiddle this flag, we need fewer. For instance, in your
proposed documentation:

@@ -61,6 +61,11 @@ index file, all SHA-1 references in `refs`
namespace, and all reflogs
        object pools.  This is now default; you can turn it off
        with --no-full.

+--[no-]checksum-index:
+       Validate the checksum at the end of the index file, on by
+       default, locally overrides any "core.checksumIndex" setting
+       unless negated. See linkgit:git-config[1].

That tells us _what_ it does, but I'm left scratching my head with
"why".

I don't think there is any reason you would want fsck not to compute
that checksum (just like there is no option to ask fsck not to check
pack sha1 trailers).

I would go so far as to say that the config option itself is unnecessary
in this iteration of the series. I only asked for it so that we could
test the verification code paths (both for correctness and performance).
But if fsck can exercise the code path, we can check correctness that
way. And for performance, it's probably enough to test two separate
builds (which Jeff has already done).

Junio also asked for the usual "add a config, and then later we'll flip
the default" procedure. But IMHO that isn't necessary here. Nobody
should ever care about this flag. It was largely useless to check it on
every read in the first place. And if you suspect there's corruption in
your repository, you should run "git fsck".

The part that confused my & I found unintuitive is that there's a new
core.WHATEVER config that'll get silently overridden by a specific
command, git-fsck.

Nothing else I can think of in core.* works like this, i.e. it's a
namespace for "applies to all of git", core.editor, core.ignoreCase
etc.

My "force_core_checksum_index" global override was a bit of a hack.
I looked at having fsck explicitly set the "core.checksumindex" config
value (which would write to the disk in the code paths I followed)
before it loaded the index, but if we ever find an invalid checksum,
read_cache() would call die() and fsck would not have a chance to set
it back in the user's config.  So I introduced the global override.


Having git-fsck have a command-line option that's on by default as I
suggested is one way to get out of that confusion. It makes it a
special case of a CLI option overriding some config.

I looked at doing that, but thought it would be overkill since
no one is likely to care about turning it off -- or rather, fsck
should always do it whenever it reads the index.


But yeah, another way to resolve this is to get rid of the config
option altogether, or document in git-config.txt that
core.checksumIndex is obeyed by everything except git-fsck.


If there's no objections then, I'll just remove the config
setting and keep the force_ global for fsck.

Jeff