When git looks for an object in a pack, it has to go through all the pack indexes. Pack order becomes important because if the pack is found early, the search stops early. Git sorts packs by mostly by mtime, favoring recent packs. This strategy however puts the main pack at the bottom because the main pack is usually created at clone time and occasional "repack -ad", so likely the oldest. Though being the largest pack, this pack is likely hit (even latest commits have unchanged parts, which is likely in the main pack). This patch changes pack order a little bit, putting the main pack on top, the rest is ordered as usual. When we look for latest changes, it'll likely miss the main pack. But after the first miss, "last_found_pack" optimization kicks in, thus reduce the miss cost. I modified find_pack_entry() to record pack misses, then do a few commonly used commands like "git diff", "git diff --cached", "git diff HEAD", "git status", "git branch -v", "git log"... Pack miss decreased in all cases. So I think this might be a good heuristic. But this is micro-optmization and probably does not give us any significant performance gain. A probably better way to improve too-many-packs situation is create meta-index in memory for small packs. An implementation note, packed_git->num_objects should be used to determine the largest pack, but that field may not be available until open_packed_git() is called. pack_size should be a good enough approximation. Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@xxxxxxxxx> --- sha1_file.c | 21 ++++++++++++++++++++- 1 files changed, 20 insertions(+), 1 deletions(-) diff --git a/sha1_file.c b/sha1_file.c index 798cd46..05ece2d 100644 --- a/sha1_file.c +++ b/sha1_file.c @@ -1072,6 +1072,24 @@ static int sort_pack(const void *a_, const void *b_) return -1; } +static int sort_pack_by_size(const void *a_, const void *b_) +{ + struct packed_git *a = *((struct packed_git **)a_); + struct packed_git *b = *((struct packed_git **)b_); + int st; + + /* + * Local packs tend to contain objects specific to our + * variant of the project than remote ones. In addition, + * remote ones could be on a network mounted filesystem. + * Favor local ones for these reasons. + */ + st = a->pack_local - b->pack_local; + if (st) + return -st; + return b->pack_size - a->pack_size; +} + static void rearrange_packed_git(void) { struct packed_git **ary, *p; @@ -1087,7 +1105,8 @@ static void rearrange_packed_git(void) for (n = 0, p = packed_git; p; p = p->next) ary[n++] = p; - qsort(ary, n, sizeof(struct packed_git *), sort_pack); + qsort(ary, n, sizeof(struct packed_git *), sort_pack_by_size); + qsort(ary + 1, n - 1, sizeof(struct packed_git *), sort_pack); /* link them back again */ for (i = 0; i < n - 1; i++) -- 1.7.3.1.256.g2539c.dirty -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html