Re: [PATCH v7] read-cache: force_verify_index_checksum

Johannes Schindelin <Johannes.Schindelin@xxxxxx> · Tue, 25 Apr 2017 23:04:33 +0200 (CEST)

Hi,

On Mon, 24 Apr 2017, Junio C Hamano wrote:

> Jeff Hostetler <git@xxxxxxxxxxxxxxxxx> writes:
> 
> >>> +test_expect_success 'detect corrupt index file in fsck' '
> >>> +    cp .git/index .git/index.backup &&
> >>> +    test_when_finished "mv .git/index.backup .git/index" &&
> >>> +    echo zzzzzzzz >zzzzzzzz &&
> >>> +    git add zzzzzzzz &&
> >>> +    sed -e "s/zzzzzzzz/yyyyyyyy/" .git/index >.git/index.yyy &&
> >>
> >> sed on a binary file? Sooner or later we are going to run into
> >> portability issues.
> >
> > In v5 of this patch series I used "perl" and it was suggested that I
> > use "sed" instead.  It doesn't matter to me which we use.  My testing
> > showed that it was safe, but that was only Linux.

I am sorry to hear that the Git mailing list's review gives you whiplash.

The problem with sed is that BSD sed behaves a bit differently than GNU
sed, and we quietly expect every contributor to be an expert in the
portability aspects of sed.

TBH I am quite surprised that anybody would have suggested to use sed
rather than Perl to edit binary files in the first place. In my opinion,
that was bad advice.

> > Does the mailing list have a preference for this ?
> 
> Instead of munging pathnames z* to y*, I'd prefer to see the actual
> checksum bytes at the end replaced in the index file.  After all
> that is what this test really cares about, and it ensures that the
> failure detected is due to checksum mismatch.

I see that v8 uses a Perl script again, and it is well written and
obvious.

Just in case that certain reviewers favor length over readability, let me
offer this snippet:

	size=$(perl -e "print -s \".git/index\"") &&
	dd if=/dev/zero of=.git/index bs=1 seek=$(($size-20) count=20

Since whatever hash will be used in the future is most likely larger than
20 bytes, this should still work fine (and even if somebody sane replaces
the SHA-1 of the index with a CRC-32 for the same benefit we have now, the
test will fail quickly and it is easy to replace the 20 by 4).

Ciao,
Dscho