I <gitster@xxxxxxxxx> wrote: > - In the original recency order, trees and blobs are intermixed. Write > trees together before blobs, in the hope that this will improve > locality when running pathspec-limited revision traversal, i.e. > "git log paths..."; > > - When writing blob objects out, write the whole family of blobs that use > the same delta base object together, by starting from the root of the > delta chain, and writing its immediate children in a width-first > manner, in the hope that this will again improve locality when reading > blobs that belong to the same path, which are likely to be deltified > against each other. An interesting tangent. With a single-liner change to apply the same "write the entire family as a group" logic also to tree objects, we get a superb improvement to the "index-pack" workload, while "log pathspec" becomes measurably worse. > * Pathspec-limited log. > $ git log drivers/net >/dev/null > v1.7.6 with patch > 60.0% percentile : 688,671 443,578 > 70.0% percentile : 319,574,732 110,370,100 > 99.0% percentile : 412,942,470 133,078,115 > 99.5% percentile : 413,172,266 133,163,349 > 99.9% percentile : 413,354,356 133,240,445 > Less than 2MiB seek: 61.71% 62.87% 60.00% percentile : 2,240,841 70.00% percentile : 115,729,538 99.00% percentile : 132,912,877 99.90% percentile : 133,129,574 Less than 2MiB seek : 59.77% This of course is because trees tend to delta well against each other and writing out everything in the same family tends to coalesce ancient and recent trees close together, while the history traversal wants to see the trees at the same path in adjacent commits, and the process to jump over to the "next tree" in the same commit need to seek a lot more. > * Blame. > $ git blame drivers/net/ne.c >/dev/null > v1.7.6 with patch > 50.0% percentile : 11,248 7,334 > 95.0% percentile : 408,230,673 132,174,308 > Less than 2MiB seek: 56.47% 56.83% 50.00% percentile : 79,182 95.00% percentile : 132,547,959 Less than 2MiB seek : 56.16% This also is harmed by trees in adjacent commits being further apart, but blame needs to read a lot of blobs as well, so the effect is not as grave as tree-only "log pathspec" case. > * Index pack. > $ git index-pack -v .git/objects/pack/pack*.pack > v1.7.6 with patch > 80.0% percentile : 1,317,862 4,971 > 90.0% percentile : 11,926,385 119,398 > 99.9% percentile : 382,919,785 33,155,191 > Less than 2MiB seek: 81.73% 96.92% 80.00% percentile : 745 90.00% percentile : 1,891 99.90% percentile : 80,001 Less than 2MiB seek : 100.00% It is so obvious that the in pack object ordering with "the whole family at once" tweak is extremely tuned for this workload that the result is not even funny. The moral of the story is that index-pack is not a workload to optimize for ;-) and the following is a wrong patch to apply. In general, we have to be extremely careful not to tune for a particular workload while unknowingly sacrificing other workloads. I wonder if we would get a better locality for "blame" by not writing the whole family at once for blobs, but only a smaller subset. diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c index 27132bb..adcb509 100644 --- a/builtin/pack-objects.c +++ b/builtin/pack-objects.c @@ -553,7 +553,7 @@ static struct object_entry **compute_write_order(void) for (i = 0; i < nr_objects; i++) { if (objects[i].type != OBJ_TREE) continue; - add_to_write_order(wo, &wo_end, &objects[i]); + add_family_to_write_order(wo, &wo_end, &objects[i]); } /* -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html