On Mon, Nov 20, 2017 at 12:35 PM, Jeff King <peff@xxxxxxxx> wrote: > In theory nobody should ever ask the low-level object code > for a null sha1. It's used as a sentinel for "no such > object" in lots of places, so leaking through to this level > is a sign that the higher-level code is not being careful > about its error-checking. In practice, though, quite a few > code paths seem to rely on the null sha1 lookup failing as a > way to quietly propagate non-existence (e.g., by feeding it > to lookup_commit_reference_gently(), which then returns > NULL). > > When this happens, we do two inefficient things: > > 1. We actually search for the null sha1 in packs and in > the loose object directory. > > 2. When we fail to find it, we re-scan the pack directory > in case a simultaneous repack happened to move it from > loose to packed. > > It's debatable whether (1) is a good idea or not. The > performance drop is at least linear in the number of > lookups, so it's not too bad. And if by some 2^-160th chance > somebody does have such an object, we probably ought to > access it. On the other hand, enough code paths treat the > null sha1 specially that the object probably isn't usable, > anyway. > > Problem (2), on the other hand, is a pretty severe > performance issue. If you have a large number of packs, > rescanning the pack directory can be very expensive. And it > only helps in the case that you have an object with the null > sha1 _and_ somebody was simultaneously repacking it. The > tradeoff is certainly a bad one there. > > In an ideal world, we'd simply fix all of the callers to > notice the null sha1 and avoid passing it to us. But a > simple experiment to catch this with a BUG() shows that > there are a large number of code paths. > > In the meantime, let's address problem (2). That's the > minimal fix we can do to make the performance problems go > away. p5551 shows this off (when a fetched ref is new, the > "old" sha1 is 0{40}, which ends up being passed for > fast-forward checks, the status table abbreviations, etc): > > Test HEAD^ HEAD > -------------------------------------------------------- > 5551.4: fetch 5.51(5.03+0.48) 0.17(0.10+0.06) -96.9% > > We could address (1), too, but there's not much performance > improvement left to make. > > Signed-off-by: Jeff King <peff@xxxxxxxx> > --- > This is the minimal fix that addresses the performance issues. > I'd actually have no problem at all declaring that looking up a null > sha1 is insane, and having the object-lookup routines simply return "no > such object" without even doing the loose/pack lookup first. > > I'm also fine with going down the BUG() route and fixing the > higher-level callers, but it's a big enough task (with little enough > real-world impact) that I think it would be worth applying this in the > meantime. It would have a lot of impact in the future, when new developers are hindered mis-using the API. The (unwritten) series with introducing BUG() would help a lot in 'holding it right' and I would expect fewer performance regressions over time. The patch is impressively small for such a performance gain. Personally I think (1) (which essentially means "making null sha1 work like a regular sha1") is quite an endeavor at this point in time for this code base. As a tangent, a similar but solved problem in the diff code is how NUL in user data is treated in xdiff for example, as there we kept being careful since the beginning (though I think we don't have tests for it, so it might be broken) Stefan