On Thu, Oct 4, 2012 at 1:21 AM, Jeff King <peff@xxxxxxxx> wrote:
> On Thu, Oct 04, 2012 at 12:32:35AM +0200, Ævar Arnfjörð Bjarmason wrote:
>
>> On Wed, Oct 3, 2012 at 8:03 PM, Jeff King <peff@xxxxxxxx> wrote:
>> > What version of git are you using? In the past year or so, I've made
>> > several tweaks to speed up large numbers of refs, including:
>> >
>> > - cff38a5 (receive-pack: eliminate duplicate .have refs, v1.7.6); note
>> >   that this only helps if they are being pulled in by an alternates
>> >   repo. And even then, it only helps if they are mostly duplicates;
>> >   distinct ones are still O(n^2).
>> >
>> > - 7db8d53 (fetch-pack: avoid quadratic behavior in remove_duplicates)
>> >   a0de288 (fetch-pack: avoid quadratic loop in filter_refs)
>> >   Both in v1.7.11. I think there is still a potential quadratic loop
>> >   in mark_complete()
>> >
>> > - 90108a2 (upload-pack: avoid parsing tag destinations)
>> >   926f1dd (upload-pack: avoid parsing objects during ref advertisement)
>> >   Both in v1.7.10. Note that tag objects are more expensive to
>> >   advertise than commits, because we have to load and peel them.
>> >
>> > Even with those patches, though, I found that it was something like ~2s
>> > to advertise 100,000 refs.
>>
>> FWIW I bisected between 1.7.9 and 1.7.10 and found that the point at
>> which it went from 1.5/s to 2.5/s upload-pack runs on the pathological
>> git.git repository was none of those, but:
>>
>> ccdc6037fe - parse_object: try internal cache before reading object db
>
> Ah, yeah, I forgot about that one. That implies that you have a lot of
> refs pointing to the same objects (since the benefit of that commit is
> to avoid reading from disk when we have already seen it).
>
> Out of curiosity, what does your repo contain? I saw a lot of speedup
> with that commit because my repos are big object stores, where we have
> the same duplicated tag refs for every fork of the repo.

Things are much faster with your monkeypatch, got up to around 10 runs/s.

The repository mainly contains a lot of git-deploy[1] generated tags,
which are added for every rollout to several subsystems. Of the ~50k
references in the repo, 75% point to a commit that no other reference
points to. Around 98% of the references are annotated tags; the rest
are branches.

1. https://github.com/git-deploy/git-deploy
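
For anyone wanting to produce the same kind of breakdown for their own repository (share of refs whose target commit is not shared with any other ref, and share of annotated tags), here is a rough sketch. It is not the script used for the numbers above. The function name `ref_stats` is made up for illustration, and the input is assumed to be one line per ref as printed by `git for-each-ref --format='%(objecttype) %(objectname) %(*objectname) %(refname)'`. Tags that point at other tags may not be peeled all the way down to a commit by that format, so treat the result as approximate.

```python
from collections import Counter


def ref_stats(lines):
    """Summarize refs from assumed `git for-each-ref` output.

    Each line is expected to look like:
        <objecttype> <objectname> <*objectname> <refname>
    where <*objectname> is empty for refs that are not annotated tags.
    """
    refs = []
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            continue
        # Split on single spaces so the empty peeled field survives.
        objtype, oid, peeled, refname = line.split(" ", 3)
        # For an annotated tag, compare by the object it points at.
        target = peeled or oid
        refs.append((refname, objtype, target))

    total = len(refs)
    if total == 0:
        return {"total": 0,
                "unique_target_fraction": 0.0,
                "annotated_tag_fraction": 0.0}

    # How many refs resolve to each target object.
    counts = Counter(target for _, _, target in refs)
    unique = sum(1 for _, _, target in refs if counts[target] == 1)
    tags = sum(1 for _, objtype, _ in refs if objtype == "tag")

    return {"total": total,
            "unique_target_fraction": unique / total,
            "annotated_tag_fraction": tags / total}
```

To use it, feed it the lines of the `git for-each-ref` output, for example by passing `sys.stdin` after piping the command into the script.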