Re: [PATCH v2 01/10] bundle: optionally skip reachability walk

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Derrick Stolee <derrickstolee@xxxxxxxxxx> writes:

> I'll focus on this area today and see what I can learn and how I
> can approach this problem in a different way. The current options
> that I see are:
>
>  1. Leave verify_bundle() as-is and figure out how to refresh the
>     refs. (This would remain a stricter check than necessary.)

Even if we switch to "assume everything is OK, remember a few key
facts (like prerequisites and tips) about each bundle as we unpack
them, and sanity check the results at the end" approach, doesn't
that last step require us to be able to see the final state of the
refs?  If so, wouldn't we need to figure out how to refresh the refs
no matter what?

>  2. Find out how to modify verify_bundle() so it can do the more
>     relaxed connectivity check.

I am not sure what kind of relaxing you have in mind, but as long as
we can guarantee the connectedness of the end result

>  3. Take the connectivity check that fetch uses before updating
>     refs and add that check before updating refs in the bundle URI
>     code.

This is optional at much lower priority, isn't it?  In the second
example in the message you are responding to, I do not think it is
too bad to reject the bundle based on '8' that has been rewound away
(in other words, a bundle publisher ought to be basing their bundles
on well publicized and commonly available commits).  Only when we
try to be overly helpful to such a use case, it becomes necessary to
loosen the rule from "all prerequisites must be reachable from
existing refs" to "or prerequisites that are not reachable from any
refs are also OK if they pass check_connected()".

The current check to require that prerequisites are reachable from
refs does not have to check trees and blobs, because any commit that
is reachable from an existing ref is complete[*] by definition.

    Let's define a term: a commit is "complete" iff it is not
    missing any objects that it (recursively) references to.

The check done by check_connected() is more expensive because it has
to prove that a commit, which is found in the object store and may
or may not be reachable from any refs, is complete.  The tranversal
still can take advantage of the fact that commits _reachable_ from
refs are guaranteed to be complete, but until the traversal reaches
a commit that is reachable from refs (e.g. when inspecting commits
'8' and then '7' until it reaches '6', in the second example in the
message you are responding to) we need to look at trees and blobs.

> There could also be a combination of (2) and (3), or others I have
> not considered until I go poking around in the code.
>
> I'll let you know what I find.

Thanks.  Unlike areas that allow glitches as long as workarounds are
available (e.g. UI), the object store + refs layer is where it is
absolutely required to be correct.  I am happy to see capable minds
are on it.



[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux