Johannes Schindelin <Johannes.Schindelin@xxxxxx> writes:

> Hi,
>
> On Tue, 8 Sep 2009, Johan Herland wrote:
>
>> Algorithm / Notes tree   git log -n10 (x100)   git log --all
>> ------------------------------------------------------------
>> next / no-notes               4.77s              63.84s
>>
>> before / no-notes             4.78s              63.90s
>> before / no-fanout           56.85s              65.69s
>>
>> 16tree / no-notes             4.77s              64.18s
>> 16tree / no-fanout           30.35s              65.39s
>> 16tree / 2_38                 5.57s              65.42s
>> 16tree / 2_2_36               5.19s              65.76s
>>
>> flexible / no-notes           4.78s              63.91s
>> flexible / no-fanout         30.34s              65.57s
>> flexible / 2_38               5.57s              65.46s
>> flexible / 2_2_36             5.18s              65.72s
>> flexible / ym                 5.13s              65.66s
>> flexible / ym_2_38            5.08s              65.63s
>> flexible / ymd                5.30s              65.45s
>> flexible / ymd_2_38           5.29s              65.90s
>> flexible / y_m                5.11s              65.72s
>> flexible / y_m_2_38           5.08s              65.67s
>> flexible / y_m_d              5.06s              65.50s
>> flexible / y_m_d_2_38         5.07s              65.79s
>
> It's good to see that the no-notes behaves roughly like baseline.
>
> I can see that some people may think that date-based fan-out is the cat's
> ass,

Actually, my knee-jerk reaction was that 4.77 (next) vs 5.57 (16tree with
2_38) is already a good enough performance/simplicity tradeoff, and 5.57
vs 5.08 (flexible with ym_2_38) probably does not justify the risk of
worst-case behaviour that can come from a possible mismatch between the
access pattern and the date-optimized tree layout.

But that only argues against supporting _only_ the date-optimized layout.

The "flexible layout" support is not as flexible as its name suggests; a
single notes tree needs to have a uniform fanout strategy.  But it is not
unusably rigid either; you only need to be extra careful when merging two
notes trees.  We can leave the heuristics for choosing the optimum layout
to later rounds.

> - I find the restriction to commits rather limiting.

Yeah, we would not want to be surprised to find many people want to
annotate non-commits with this mechanism.
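To put the tradeoff argued above into numbers, here is a small sketch that turns the quoted "git log -n10 (x100)" timings into per-invocation overhead relative to 'next'. The timings are copied from the table; reading "(x100)" as "total over 100 invocations" is an assumption, and the helper name `overhead` is made up for this illustration.

```python
# Timings copied from the quoted table: "git log -n10 (x100)", in seconds.
LOG_N10_X100 = {
    "next / no-notes": 4.77,
    "16tree / no-fanout": 30.35,
    "16tree / 2_38": 5.57,
    "flexible / ym_2_38": 5.08,
}

# Assumption: "(x100)" means each figure is the total for 100 invocations.
RUNS = 100


def overhead(label, baseline="next / no-notes"):
    """Return (extra milliseconds per invocation, percent over baseline)."""
    base = LOG_N10_X100[baseline]
    extra = LOG_N10_X100[label] - base
    return extra / RUNS * 1000.0, extra / base * 100.0


if __name__ == "__main__":
    for label in LOG_N10_X100:
        ms, pct = overhead(label)
        print(f"{label:20s} +{ms:6.1f} ms/run  (+{pct:5.1f}%)")
```

Under that reading, the gap between the SHA-1 based 2_38 layout and the date-based ym_2_38 layout is a few milliseconds per invocation, while the no-fanout case is in a different league altogether.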
> - most of the performance difference between the date-based and the SHA-1
>   based fan-out looks to me as if the issue was the top-level tree.
>   Basically, this tree has to be read _every_ time _anybody_ wants to read
>   a note.

A comparison between 'next' and another algorithm that opens the top-level
notes tree object and returns "I did not find any note" without doing
anything else would reveal that cost.

But when you are doing "log -n10" (or "log --all"), you would read the
notes top-level tree once, and it is likely to be cached in the obj_hash[]
(or in delta_base cache) already for the remaining invocations, even if
notes mechanism does not do its own cache, which I think it does, no?

> - I'd love to see performance numbers for less than 157118 notes.  Don't
>   get me wrong, it is good to see the worst-case scenario in terms of
>   notes/commits ratio.  But it will hardly be the common case, and I
>   very much would like to optimize for the common case.
>
>   So, I'd appreciate if you could do the tests with something like 500
>   notes, randomly spread over the commits (rationale: my original
>   understanding was that the notes could amend commit messages, and that
>   is much more likely to be done with relatively old commits that you
>   cannot change anymore).

Hmph, is that a typical use case?  How does it relate to CC's object
replacement mechanism?

Also Gitney talked about annotating commits in the code-review thing.
What's the expected notes density and distribution in that application?
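As a rough illustration of why the size of the top-level tree matters, the sketch below counts how many entries that tree would hold under a few SHA-1 based fanouts, for both the 157118-note case and a 500-note sample. It assumes that "2_38" and "2_2_36" mean splitting the 40-hex object name into 2/38 and 2/2/36 path components; the object names are synthetic stand-ins, and `note_path`, `top_level_entries` and `synthetic_names` are hypothetical helpers, not anything in git. It counts tree entries only and says nothing about actual timings.

```python
import hashlib
import random


def note_path(sha1_hex, fanout):
    """Split a 40-hex object name into path components.

    fanout=() keeps a flat name, (2,) gives 2/38, (2, 2) gives 2/2/36.
    """
    parts, rest = [], sha1_hex
    for width in fanout:
        parts.append(rest[:width])
        rest = rest[width:]
    parts.append(rest)
    return "/".join(parts)


def top_level_entries(names, fanout):
    """Count the distinct entries the top-level notes tree would hold."""
    return len({note_path(n, fanout).split("/", 1)[0] for n in names})


def synthetic_names(count):
    """Stand-in object names; real ones would come from a repository."""
    return [hashlib.sha1(str(i).encode()).hexdigest() for i in range(count)]


if __name__ == "__main__":
    commits = synthetic_names(157118)  # note count quoted in the thread
    # "something like 500 notes, randomly spread over the commits"
    sparse = random.Random(0).sample(commits, 500)
    schemes = (("no-fanout", ()), ("2_38", (2,)), ("2_2_36", (2, 2)))
    for label, names in (("157118 notes", commits), ("500 notes", sparse)):
        for scheme, fanout in schemes:
            print(label, scheme, top_level_entries(names, fanout))
```

With a two-hex-digit first level the top-level tree can never exceed 256 entries, whereas without fanout it grows with the number of notes, which is consistent with the no-fanout rows standing out in the table.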