Re: [PATCH 1/7] cat-file: disable object/refname ambiguity check for batch mode

Jeff King <peff@xxxxxxxx> · Mon, 15 Jul 2013 00:23:48 -0400

On Fri, Jul 12, 2013 at 12:30:07PM +0200, Michael Haggerty wrote:

> But with particular respect to "git cat-file", I see problems:
> 
> 1. get_ref_snapshot() would have to read all loose and packed refs
> within the specified subtree, because loose refs have to be read before
> packed refs.  So the call would be expensive if there are a lot of loose
> refs.  And DWIM wouldn't know in advance where the references might be,
> so it would have to set prefix="".  If many refs are looked up, then it
> would presumably be worth it.  But if only a couple of lookups are done
> and there are a lot of loose refs, then using a cache would probably
> slow things down.
> 
> The slowdown could be ameliorated by adding some more intelligence, for
> example only populating the loose refs cache after a certain number of
> lookups have already been done.

I had assumed we would load the loose ref cache progressively by
directory. Of course that only helps avoiding "refs/foo/*" when you are
interested in "refs/heads/foo". If your "refs/heads/*" is very large,
you have to load all of it, and at some point it is cheaper to do a few
stat() calls than to actually readdir() the whole directory. On the
other hand, that is basically how packed-refs work now (if we hit any
ref, we have to load the whole file), and repos with many refs would
typically pack them all anyway.

> 2. A "git cat-file --batch" process can be long-lived.  What guarantees
> would users expect regarding its lookup results?

I hadn't really intended this for general lookups, but just to speed up
the refname warning, where being out-of-date is more acceptable (since
the warning is a purely informational hint).

-Peff
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html