Re: [PATCH v2 01/10] bundle: optionally skip reachability walk

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Junio C Hamano <gitster@xxxxxxxxx> writes:

> A repository having some unreachable objects floating in the object
> store is not corrupt.  As long as all the objects reachable from refs
> are connected, that is a perfectly sane state.
>
> But allowing unbundling with the sanity check loosened WILL corrupt
> it, at the moment you point some objects from the bundle with refs.

While all of the above is true, I think existing check done by
bundle.c::verify_bundle() is stricter than necessary.  We make sure
that the prerequiste objects exist and are reachable from the refs.
But for the purpose of ensuring the health of the repo after the
operation, it is also OK if the prerequisite objects exist and they
pass connected.c::check_connected() test to reach existing refs.
verify_bundle() that is used in unbundle() does not allow it.

In a simplest case, with a single ref and a single strand of pearls
history, with a few cruft history that are connected to the main
history but are not anchored by any ref (e.g. the tip of 'main' was
rewound recently and we haven't done any GC).

                             7---8 (cruft)
                            /
   0---1---2---3---4---5---6 refs/heads/main

And they have another repository created back when '5' was the
latest and greatest, which built three commits on top of it.

   0---1---2---3---4---5---A---B---C

They create a bundle of 5..C to update us to C.  In the meantime, we
have created three commits ourselves (6, 7, and 8) but threw two
away.

If a bundle requires commit '5', because it is reachable from an
existing ref (which points at the 'main' branch), we can unbundle it
and point a ref at the tip of the history contained within the
bundle safely.  Commit '5' being pointed by a ref means the commit,
its ancestors, and all trees and blobs referenced are available to
the repository (some may be fetched lazily from promisor), and
unless the producer lied and placed unconnected data in the bundle,
unpacking a bundle on top of '5' should give us all the objects that
are needed to complete the new tip proposed by the bundle (e.g. 'C').

                             7---8 (cruft)
                            /
   0---1---2---3---4---5---6 refs/heads/main
                        \
                         A---B---C refs/bundle-1/main

And existing check that I called "sticter than necessary" easily
makes sure that it is safe to point 'C' with our ref.

Imagine another party cloned us back when 'main' was pointing at '8'
(which we since then have rewound), and built a few commits on it.

                                   X---Y refs/bundle-2/main
                                  /
                             7---8 (cruft)
                            /
   0---1---2---3---4---5---6 refs/heads/main

As they did not know we'd rewind and discard 7 and 8, they would
have created their bundle to cover 8..Y.  The current test will fail
because '8' that is prerequisite is not reachable from any ref on
our side.  But that is too pessimistic.

As long as we haven't garbage collected so that all objects
reachable from '7' and '8' are available to this repository,
however, we should be able to unbundle the bundle that has '8' as
its prerequisite.  For that, we only need that '8' passes the
check_connected() check, which essentially means "we shouldn't find
any missing link while traversing history from '8' that stops at any
existing refs".

Again this relies on the fact that unbundling code makes sure that
incoming data is fully connected (i.e. bundle producer did not lie
about the prerequisite).




[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux