On Tue, Nov 25, 2014 at 08:24:49PM -0500, Jeff King wrote:
> On Tue, Nov 25, 2014 at 08:00:51PM -0500, Jeff King wrote:
>
> > On Wed, Nov 26, 2014 at 09:42:42AM +0900, Mike Hommey wrote:
> >
> > > I have a note tree with a bit more than 200k notes.
> > >
> > > $ time git notes --ref foo show $sha1 > /dev/null
> > > real 0m0.147s
> > > user 0m0.136s
> > > sys 0m0.008s
> > >
> > > That's a lot of time, especially when you have a script that does that
> > > on a fair amount of sha1s.
> >
> > IIRC, the notes code populates an in-memory data structure, which gives
> > faster per-commit lookup at the cost of some setup time. Obviously for a
> > single lookup, that's going to be a bad tradeoff (but it does make sense
> > for "git log --notes"). I don't know offhand how difficult it would be
> > to tune the data structure differently (or avoid it altogether) if we
> > know ahead of time we are only going to do a small number of lookups.
> > But Johan (cc'd) might.
>
> One other question: how were your notes created?
>
> I tried to replicate your setup by creating one note per commit in
> linux.git (over 400k notes total). I did it with one big mktree,
> creating a single top-level notes tree. Doing a single "git notes show"
> lookup on the tree was something like 800ms.
>
> However, this is not what trees created by git-notes look like. It
> shards the object sha1s into subtrees (1a/2b/{36}), and I think does so
> dynamically in a way that keeps each individual tree size low. The
> in-memory data structure then only "faults in" tree objects as they are
> needed. So a single lookup should only hit a small part of the total
> tree.
>
> Doing a single "git notes edit HEAD" in my case caused the notes code to
> write the result using its sharding algorithm. Subsequent "git notes
> show" invocations were only 14ms.
>
> Did you use something besides git-notes to create the tree? From your
> examples, it looks like you were accounting for the sharding during
> lookup, so maybe this is leading in the wrong direction (but if so, I
> could not reproduce your times at all even with a much larger case).

So... this is interesting. I happen to have recreated the notes tree
"manually", and now each git notes show takes under 10ms.

Now, looking at the notes tree reflog, I see that at some point, some
notes were added at the top-level of the tree, without being nested,
which is strange.

And it looks like it's related to how I've been adding them, through
git-fast-import. I was using notemodify commands, and was using the
filemodify command to load the previous notes tree instead of using the
from command because I don't care about keeping the notes history. So
fast-import was actually filling the notes tree as if it were starting
over with whatever new notes were added with notemodify (which, in a
case where there were many, it filled with one level of indirection).

I'm not sure this is a case worth fixing in fast-import. I can easily
work around it.

Mike
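
For reference, the sharded layout mentioned in the quoted text (1a/2b/{36}) can be sketched in shell. This is only an illustration of the path shape, not the code git-notes or fast-import actually use: note_path is a hypothetical helper, and the fanout level is passed in by hand here, whereas git picks it dynamically.

```shell
# Hypothetical helper (not part of git): print the path a note for
# object $1 would have in a notes tree with $2 levels of fanout.
# Fanout 0 is the flat, top-level layout; fanout 2 gives 1a/2b/{36}.
note_path() {
    sha=$1
    fanout=${2:-2}      # assumption: default to two levels
    path=
    while [ "$fanout" -gt 0 ]; do
        rest=${sha#??}                  # everything after the first two hex digits
        path="$path${sha%"$rest"}/"     # append those two digits as a directory
        sha=$rest
        fanout=$((fanout - 1))
    done
    printf '%s%s\n' "$path" "$sha"
}

note_path 1a2b3c4d5e6f708192a3b4c5d6e7f8091a2b3c4d 2
# prints 1a/2b/3c4d5e6f708192a3b4c5d6e7f8091a2b3c4d
```

With fanout 0 the helper prints the bare 40-character name, which is the shape of the un-nested top-level entries described above.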