Re: [PATCH v2 01/10] bundle: optionally skip reachability walk

Junio C Hamano <gitster@xxxxxxxxx> · Mon, 23 Jan 2023 12:13:33 -0800

Derrick Stolee <derrickstolee@xxxxxxxxxx> writes:

> We are specifically removing the requirement that the objects are
> reachable from refs, we still check that the objects are in the
> object store. Thus, we can only be in a bad state afterwards if
> the required objects for a bundle were in the object store,
> previously unreachable, and one of these two things happened:
>
> 1. Some objects reachable from those required commits were already
>    missing in the repository (so the repo's object store was broken
>    but only for some unreachable objects).

A repository having some unreachable objects floating in the object
store is not corrupt.  As long as all the objects reachable from refs
are connected, that is a perfectly sane state.

But allowing unbundling with the sanity check loosened WILL corrupt
it, at the moment you point some objects from the bundle with refs.

> I think we should trust the repository to not be in the first state,

So, I think this line of thought is simply mistaken.

>> I am OK as long as we check the assumption holds true at the end;
>> this looks like a good optimization.
>  
> So are you recommending that we verify all objects reachable from
> the new refs/bundles/* are present after unbundling?

Making sure that prerequisites are connected will reduce the span of
the DAG we would need to verify.  After unbundling all bundles, but
before updating the refs to point at the tips in the bundles, if we
can make sure that these prerequisite objects named in the bundles
are reachable from the tips recorded in the bundles, while stopping
the traversal at the tips of original refs (remember: we have only
updated objects in the object store, but haven't updated the refs from
the bundles), that would allow us to make sure that the updates to
refs proposed by the bundles will not corrupt the repository.