Re: [PATCH v2 0/8] midx: prevent bitmap corruption when permuting pack order

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 12/15/2021 2:46 PM, Junio C Hamano wrote:
> Taylor Blau <me@xxxxxxxxxxxx> writes:
> 
>> Here is a reroll of my series which fixes a serious problem with MIDX bitmaps by
>> which they can become corrupt when permuting their pack order.
...
>> So I'm definitely open to suggestions there, but otherwise this series should go
>> a long ways towards fixing my design mistake of having the MIDX .rev file be
>> separate from the MIDX itself.
> 
> Yeah, a single file with different chunks is a good way to ensure
> atomicity of update.

I just reviewed this series for the first time. Sorry for being so
late getting to it.

I had a few minor recommendations but they are mostly nitpicks and
don't deserve holding up the series if there are no other major
comments.

> A note to reviewers.
> 
> We need to make sure that not just we can still read .rev in
> existing repositories (and convert it to the new chunk) correctly,
> but also decide what to do to older versions of Git once the
> repository is touched by this new version.  Would they be upset to
> see no .rev files or is it just the performance thing (and it is
> more correct to recompute the reverse index on the fly)?  Should the
> new chunk be made mandatory to cause them notice that they should
> not muck with the repository, or is it optional?  Things like that.

1. Can we still read a .rev?

The new test script specifically verifies that existing repositories
will continue to read their .rev upon upgrade. Their .rev files will
be replaced with the chunk during the next write.

2. What if they downgrade after the RIDX chunk is in place?

The .rev file is missing and the repo has a performance issue because
they can't use bitmaps. Correctness is not a problem. Anyone using
.rev files for server use (where bitmaps are most useful) is hopefully
already careful about downgrading Git versions.

3. Should the chunk be made mandatory?

Unfortunately, the chunk format did not follow the index format's
example of making lowercase chunk IDs required. Instead, the chunks
that are necessary for v1 are necessary forever and all other chunks
are deemed optional. Changing this would require something more
drastic like updating the version number or giving some grace period
where released versions start treating lowercase chunk IDs as required
before creating a new "required" chunk.

This does mean that if there is a version incompatibility, the RIDX
chunk will just be ignored by the older version of Git.

In terms of making this a safe format upgrade, I think Taylor has
achieved that.

The only thing I can think is that server operators might want to
deploy this version with GIT_TEST_MIDX_WRITE_REV=1 for a while, so
any need to downgrade would not suffer a performance penalty for a
missing .rev file. If that is a planned way to safely deploy this
change, then it might be worth adding a test that we safely delete
a .rev file after writing both a .rev file and a RIDX chunk. (The
RIDX chunk will be preferred, so maybe the previous .rev file hits
some logic that would skip its deletion?)

Thanks,
-Stolee



[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux