Derrick Stolee <derrickstolee@xxxxxxxxxx> writes: > I'll focus on this area today and see what I can learn and how I > can approach this problem in a different way. The current options > that I see are: > > 1. Leave verify_bundle() as-is and figure out how to refresh the > refs. (This would remain a stricter check than necessary.) Even if we switch to "assume everything is OK, remember a few key facts (like prerequisites and tips) about each bundle as we unpack them, and sanity check the results at the end" approach, doesn't that last step require us to be able to see the final state of the refs? If so, wouldn't we need to figure out how to refresh the refs no matter what? > 2. Find out how to modify verify_bundle() so it can do the more > relaxed connectivity check. I am not sure what kind of relaxing you have in mind, but as long as we can guarantee the connectedness of the end result > 3. Take the connectivity check that fetch uses before updating > refs and add that check before updating refs in the bundle URI > code. This is optional at much lower priority, isn't it? In the second example in the message you are responding to, I do not think it is too bad to reject the bundle based on '8' that has been rewound away (in other words, a bundle publisher ought to be basing their bundles on well publicized and commonly available commits). Only when we try to be overly helpful to such a use case, it becomes necessary to loosen the rule from "all prerequisites must be reachable from existing refs" to "or prerequisites that are not reachable from any refs are also OK if they pass check_connected()". The current check to require that prerequisites are reachable from refs does not have to check trees and blobs, because any commit that is reachable from an existing ref is complete[*] by definition. Let's define a term: a commit is "complete" iff it is not missing any objects that it (recursively) references to. The check done by check_connected() is more expensive because it has to prove that a commit, which is found in the object store and may or may not be reachable from any refs, is complete. The tranversal still can take advantage of the fact that commits _reachable_ from refs are guaranteed to be complete, but until the traversal reaches a commit that is reachable from refs (e.g. when inspecting commits '8' and then '7' until it reaches '6', in the second example in the message you are responding to) we need to look at trees and blobs. > There could also be a combination of (2) and (3), or others I have > not considered until I go poking around in the code. > > I'll let you know what I find. Thanks. Unlike areas that allow glitches as long as workarounds are available (e.g. UI), the object store + refs layer is where it is absolutely required to be correct. I am happy to see capable minds are on it.