On Thu, Oct 13, 2016 at 08:18:11PM +0200, Vegard Nossum wrote: > > My guess is that the number is relatively high. And that would explain > > why nobody else has really complained much; such a pattern is probably > > uncommon. > > I get ~3,700 objects "they are advertising that we don't have". > > They are all indeed tags which I don't have (nor want) locally. Thanks. That's about what I expected. > So without your patch I get these numbers: > > % cumulative self self total > time seconds seconds calls s/call s/call name > 37.34 29.83 29.83 1948265116 0.00 0.00 strip_suffix_mem > 27.56 51.85 22.02 11358 0.00 0.01 prepare_packed_git_one > 14.24 63.23 11.38 1924457702 0.00 0.00 strip_suffix > 4.88 67.13 3.90 26045319 0.00 0.00 find_pack_entry_one > 4.17 70.46 3.33 84828149 0.00 0.00 hashcmp > 3.28 73.08 2.62 79760199 0.00 0.00 hashcmp > 2.44 75.03 1.95 const_error > > the call graph data shows all the reprepare_packed_git() calls to come > from a chain like this: > > do_for_each_ref > do_for_each_ref_iterator > ref_iterator_advance > files_ref_iterator_advance > ref_resolves_to_object > has_sha1_file > has_sha1_file_with_flags > reprepare_packed_git Hrm. It seems weird that we'd hit reprepare_packed_git() via do_for_each_ref(), because that's looking at _your_ refs. So any time that code path hits reprepare_packed_git(), it should be complaining about repository corruption to stderr. And that also wouldn't make sense that my patch would improve it. Are you sure you've read the perf output correctly (I'll admit that I am often confused by the way it reports call graphs)? > With your (first) patch I get this instead: > > % cumulative self self total > time seconds seconds calls ms/call ms/call name > 29.41 0.25 0.25 4490399 0.00 0.00 hashcmp > 16.47 0.39 0.14 843447 0.00 0.00 find_pack_entry_one These functions appearing at the top are probably a sign that you have too many packs (these are just object lookup, which has to linearly try each pack). So I'd expect things to improve after a git-gc (and especially if you turn off the packSizeLimit). > Do you have all the profile data you were interested in before I try a > 'git gc'? Yes, I think so. At least I'm satisfied that there's not another HAS_SHA1_QUICK case that I'm missing. > I really appreciate the quick response and the work you put into fixing > this, even when it's my fault for disabling gc, thanks! No problem. I do think you'll benefit a lot from packing everything into a single pack, but it's also clear that Git was doing more work than it needed to be. This was a known issue when we added the racy-check to has_sha1_file(), and knew that we might have to field reports like this one. -Peff