"Chris Lee" <clee@xxxxxxx> writes: > I've been running some experiments, as hinted earlier by the > discussion about just how much git-index-pack sucks (which, really, > isn't much since the gaping memleak is gone now). > > These experiments include trying to see if there's a noticeable > performance improvement by splitting out objects of different types > into different packs. So far, it definitely seems to make a > difference, though not the one I was initially expecting. For all of > these tests, I did 'sysctl -w vm.drop_caches=3' before running, to > effectively simulate a cold-cache run. Are you running on a 64-bit machine or 32-bit? I wonder what the numbers would be if you partition into the same number of packs of similar sizes as your experiment, but partitioning based on not by type but by age or other factors. What I am getting at is that you may not be seeing the effect of access pattern based on the type at all. For example, the performance can be affected by other factors, such as necessity to use smaller number of pack_windows per pack. use_pack() iterates through the currently active windows on a linked list per pack, and a window is 32MB on 32-bit machines, so you would literally need hundreds of them to access that 3GB pack (the total is limited to 256MB so 8 windows are recycled). It is possible that simply using more packs and knowing which pack you need to access upfront may be cutting down the cost of finding the pack window to use. A single pack would have a linked list of 8 active windows, while two packs would have one linked list of each, so the average linear search cost would be half. - To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html