On Friday, September 30, 2011 04:06:31 pm Martin Fick wrote:
> On Friday, September 30, 2011 03:02:30 pm Martin Fick wrote:
> > On Friday, September 30, 2011 10:41:13 am Martin Fick
> > wrote:
> >
> > Since a full sync is now down to about 5mins, I broke
> > down the output a bit. It appears that the longest
> > part (2:45m) is still the time spent scrolling through
> > each change. Each one of these takes about 2ms:
> >
> >  * [new branch]      refs/changes/99/71199/1 -> refs/changes/99/71199/1
> >
> > Seems fast, but at about 80K refs... So, are there any
> > obvious N loops over the refs happening inside each of
> > the [new branch] iterations?
>
> OK, I believe I narrowed it down. If I comment out the
> invalidate_cached_refs() line in write_ref_sha1(), it
> speeds through this section.
>
> I guess this makes sense: we invalidate the cache and
> have to rebuild it after every new ref is added. Perhaps
> a simple fix would be to move the invalidation to right
> after all the refs are updated? Maybe write_ref_sha1()
> could take a flag telling it not to invalidate the
> cache, so that during iterative updates invalidation
> could be disabled and then run manually after the
> update?

OK, this thing has been bugging me... I found some more
surprising results. I hope you can follow, because there are
corner cases here with surprising impacts.

** Important fact: **
---------------
** When I clone my repo, it has about 4K tags which
** come in packed to the clone. **

This fact has a heavy impact on how I test things. If I
delete these packed-refs from the cloned repo and then do a
fetch of the changes, all of the tags are fetched along with
the changes. This means that if I want to test the impact of
having packed-refs vs no packed-refs on my change fetches, I
need to first delete the packed-refs file, and second fetch
all the tags again, so that when I fetch the changes, the
repo only actually fetches changes, not all the tags!

So, with this in mind, I have discovered that the fetch
performance degradation caused by invalidating the caches in
write_ref_sha1() is actually due to the packed-refs being
reloaded and re-sorted on each ref insertion (not the loose
refs)!!! Remember the important fact above? Yeah, those silly
4K tags (not a huge number, not 61K!) take a while to re-read
from the file and sort. When this is done for 61K changes, it
adds a lot of time to a fetch.

The sad part is that, of course, the packed-refs don't really
need to be invalidated, since we never add new refs as packed
refs during a fetch (though apparently we do during a clone)!
Also noteworthy: invalidating the loose refs does not cause a
big delay.

Some data:

1) A fetch of the changes in my series, with all good
external patches applied, takes about 7:30min.

2) The fetch from #1, with invalidate_cached_refs() commented
out in write_ref_sha1(), takes about 1:50min.

3) The fetch from #1, with invalidate_cached_refs() in
write_ref_sha1() replaced by a call to my custom
invalidate_loose_cached_refs() (sketched below, after the
data), takes about 1:50min.

4) The fetch from #1 on a repo with packed-refs deleted after
the clone takes about 5min. ** This is a strange regression
which threw me off. In this case, all the tags are refetched
in addition to the changes; this seems to cause some weird
interaction that makes things take longer than they should
(#5 + #6 = 2:10m << #4 5min).

5) The fetch from #1 on a repo with packed-refs deleted after
the clone, and with a fetch done first to get all the tags
(see #6), takes only 1:30m!!!!

6) A fetch to get all the **TAGS**, with packed-refs deleted
after the clone, takes about 40s.

---Additional side data/tests:

7) The fetch from #1, with a special flag causing the
packed-refs to be read from the file on each insertion, but
not parsed or sorted, takes 2:34min. So the repeated reads
alone add at least 40s.

8) The fetch from #1, with a special flag causing the
packed-refs to be read from the file and parsed, but NOT
sorted, takes 3:40min. So the parsing appears to add at
least another minute.
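To make #3 concrete, here is roughly the shape of a
loose-only invalidation. This is only a sketch: the struct
layout and helper below are simplified stand-ins for the ref
cache in refs.c (the real code keeps more state per entry),
and my actual invalidate_loose_cached_refs() differs in
detail. The point is just that the packed half of the cache
is left untouched:

#include <stdlib.h>

/* Simplified stand-in for the ref cache in refs.c; what
 * matters here is only the split between a loose list and a
 * packed list, each with its own validity flag. */
struct ref_list {
	struct ref_list *next;
	/* ref name and sha1 omitted for brevity */
};

static struct {
	char did_loose, did_packed;
	struct ref_list *loose, *packed;
} cached_refs;

static void free_ref_list(struct ref_list *list)
{
	while (list) {
		struct ref_list *next = list->next;
		free(list);
		list = next;
	}
}

/* Drop only the loose half of the cache.  The packed list
 * stays valid, so packed-refs (the 4K tags here) is not
 * re-read and re-sorted on every one of the 61K insertions
 * done via write_ref_sha1() during a fetch. */
static void invalidate_loose_cached_refs(void)
{
	if (cached_refs.did_loose)
		free_ref_list(cached_refs.loose);
	cached_refs.loose = NULL;
	cached_refs.did_loose = 0;
}

write_ref_sha1() then calls this instead of
invalidate_cached_refs(). Since a fetch only ever creates
loose refs, I can't see how the packed cache could go stale
here, but see my question below.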
I think all of the data above might explain why, no matter
how good Michael's intentions are with his patch series, his
series isn't likely to fix this problem unless it stops
invalidating the packed-refs after each insertion. I tried
preventing this invalidation in his series to prove this,
but unfortunately it appears that in his series it is no
longer possible to invalidate just the packed-refs? :(
Michael, I hope I am completely wrong about that...

Are there any good consistency reasons to invalidate the
packed refs in write_ref_sha1()? If not, would you accept a
patch to simply skip this invalidation (and only invalidate
the loose refs)?

Thanks,

-Martin

-- 
Employee of Qualcomm Innovation Center, Inc. which is a
member of Code Aurora Forum