Thanks for your suggestions. I'll hold off on sending out a new patch
(incorporating Jonathan Nieder's suggestions [1]) until we decide whether
further optimizations (for example, those suggested by Peff) need to be
done.

[1] <20170510232231.GC28740@xxxxxxxxxxxxxxxxxxxxxxxxx>

On 05/11/2017 02:46 AM, Jeff King wrote:
> On Wed, May 10, 2017 at 03:11:57PM -0700, Jonathan Tan wrote:
>> After looking at ways to solve jrnieder's performance concerns, if we're
>> going to need to manage one more item of state within the function, I
>> might as well use my earlier idea of storing unmatched refs in its own
>> list instead of immediately freeing them. This version of the patch
>> should have much better performance characteristics.
>
> Hrm. So the problem in your original was that the loop became quadratic
> in the number of refs when fetching all of them (because the original
> relies on the sorting to essentially do a list-merge). Are there any
> quadratic bits left?
>
>> @@ -649,6 +652,25 @@ static void filter_refs(struct fetch_pack_args *args,
>>  		if ((allow_unadvertised_object_request &
>>  		    (ALLOW_TIP_SHA1 | ALLOW_REACHABLE_SHA1))) {
>> +			can_append = 1;
>> +		} else {
>> +			struct ref *u;
>> +			/* Check all refs, including those already matched */
>> +			for (u = unmatched; u; u = u->next) {
>> +				if (!oidcmp(&ref->old_oid, &u->old_oid)) {
>> +					can_append = 1;
>> +					goto can_append;
>> +				}
>> +			}
>
> This is inside the nr_sought loop. So if I were to do:
>
>   git fetch origin $(git ls-remote origin | awk '{print $1}')
>
> we're back to being quadratic. I realize that's probably a silly thing
> to do, but in the general case, you're O(m*n), where "n" is number of
> unmatched remote refs and "m" is the number of SHA-1s you're looking
> for.

The original patch was quadratic regardless of whether we were fetching
names or SHA-1s, which is arguably worse, since it regressed an existing
and common use case (and I agree that that was bad). This one is O(m*n),
as you said - the quadratic behavior only kicks in if you fetch SHA-1s,
which was impossible before this patch.
> Doing better would require either sorting both lists, or storing the
> oids in something that has better than linear-time lookup. Perhaps a
> sha1_array or an oidset? We don't actually need to know anything about
> the unmatched refs after the first loop. We just need the list of oids
> that let us do can_append.

Having a sha1_array or oidset would require allocation (O(n log n) time,
I think, in addition to O(n) space), and this cost would be incurred
regardless of how many SHA-1s were actually fetched (if m is an order of
magnitude less than log n, for example, having a sha1_array might
actually be worse). Also, presumably we don't want to incur this cost if
we are fetching zero SHA-1s, so we would need to do some sort of
pre-check. I'm inclined to leave it the way it is and consider this only
if the use case of fetching many SHA-1s comes up.
> AIUI, you could also avoid creating the unmatched list entirely when the
> server advertises tip/reachable sha1s. That's a small optimization, but
> I think it may actually make the logic clearer.

If you mean adding an "if" block at the point where we add the unmatched
ref to the unmatched list (one that either adds it to the list or frees
it immediately), I think that makes the logic slightly more complicated.
Or did you have something else in mind?