On Thu, Jul 7, 2011 at 17:24, Junio C Hamano <gitster@xxxxxxxxx> wrote:
> This optimizes the "recency order" (see pack-heuristics.txt in
> Documentation/technical/ directory) used to order objects within a
> packfile in three ways:

Yay!

> - Commits at the tip of tags are written together, in the hope that
>   revision traversal done in incremental fetch (which starts by
>   putting them in a revision queue marked as UNINTERESTING) will see a
>   better locality of these objects;

Putting these together after the first tagged commit is an interesting
approach. Currently JGit drops these after 1024 recent commits have been
produced; the alternative here parks them around the most recent release.
I wonder which is really the better approach for the upload-pack workload.

I chose 1024 because linux-2.6 seemed to be getting several thousand
commits per MB of pack data, so 1024 would park the tagged commits within
the first MB (making a cold-cache upload-pack slightly faster), but it
doesn't harm `git log` going through the pager too badly because this
block is 1024 commits back.

Have you tried putting commits in parse order rather than recency order?
By this I mean the revision traversal code needs to parse the parents of
a commit in order to get their commit date and enqueue them into the
priority queue at the right position. The order the commits get parsed in
is different from the order they are popped in, especially when a commit
is created on an older revision and there is much concurrent activity on
a different branch between the commit and its ancestor. This pattern is
very typical in repositories that pull from others to merge changes
in... aka linux, git.git, etc.

My intuition says emulating the priority queue in pack-objects and using
that to influence the order commits are written out (so it's the order
commits enter the queue, rather than leave it) will further reduce minor
page faults during the `git log >/dev/null` test, and possibly also help
the other log-based workloads. It's certainly a lot more work to code
than this patch, but I wonder if it produces better ordering.

> - In the original recency order, trees and blobs are intermixed. Write
>   trees together before blobs, in the hope that this will improve
>   locality when running pathspec-limited revision traversal, i.e.
>   "git log paths...";

FWIW JGit has been doing "commits-then-trees-then-blobs" for a long time
now, and we have generally found the same results as you did here:
segregating by object type helps page faults.

> - When writing blob objects out, write the whole family of blobs that use
>   the same delta base object together, by starting from the root of the
>   delta chain, and writing its immediate children in a width-first
>   manner, in the hope that this will again improve locality when reading
>   blobs that belong to the same path, which are likely to be deltified
>   against each other.

This is an interesting approach, and one I had not considered before.

> * Simple commit-only log.
>
>   $ git log >/dev/null
...
>   95% of the pack accesses look at data that is no further than 260kB
>   from the previous location we accessed. The patch does not change the
>   order of commit objects very much, and the result is very similar.

I think a more interesting thing to examine is how often we have to
"skip back" to an earlier part of the file. Consider the case where the
~190MB-ish of commits does not fully fit into the kernel buffer cache,
but we do have read-ahead support in the kernel. Forward references are
relatively free, because read-ahead should populate that page before we
need it. Backward references are horribly expensive, because they may
have been evicted to make room for more read-ahead. From what I could
tell of similar traces in JGit, the recency commit ordering causes a lot
more backwards references than the parse ordering.
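To make the "skip back" measurement concrete, here is the sort of thing
I have in mind. This is only a rough standalone sketch (not git or JGit
code), and it assumes you can dump the pack offsets touched, one decimal
offset per line, in access order (for example from the same
instrumentation that produced the percentile numbers):

/*
 * Rough sketch: count forward vs. backward references in a pack access
 * trace read from stdin, one decimal offset per line, in access order.
 * Backward jumps are the ones read-ahead cannot hide.
 */
#include <stdio.h>
#include <stdlib.h>

int main(void)
{
        char line[64];
        unsigned long long prev = 0, cur, max_back = 0;
        unsigned long forward = 0, backward = 0;
        int first = 1;

        while (fgets(line, sizeof(line), stdin)) {
                cur = strtoull(line, NULL, 10);
                if (!first) {
                        if (cur >= prev) {
                                forward++;
                        } else {
                                backward++;
                                if (prev - cur > max_back)
                                        max_back = prev - cur;
                        }
                }
                first = 0;
                prev = cur;
        }
        printf("forward:  %lu\n", forward);
        printf("backward: %lu (largest backward jump: %llu bytes)\n",
               backward, max_back);
        return 0;
}

Running that over traces from a v1.7.6 pack and a patched pack would show
whether the new ordering actually reduces the number and size of backward
jumps, which is what matters once read-ahead is covering the forward
direction.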
> * Pathspec-limited log.
>
>   $ git log drivers/net >/dev/null
>
>   The path is touched by 26551 commits and merges (among 254656 total).
>
>                                      v1.7.6     with patch
>   Total number of access :          559,511        558,663
...
>   70.0% percentile :            319,574,732    110,370,100
>   80.0% percentile :            361,647,599    123,707,229
>   90.0% percentile :            393,195,669    128,947,636
>   95.0% percentile :            405,496,875    131,609,321
...

Does this result suggest that we needed less of the pack in memory in
order to produce the result? That is, on a cold cache we should be
touching fewer pages of the pack when this patch was used to create it,
and fewer page references would allow the command to complete more
quickly on a cold cache.

> Note that a half-a-gigabyte packfile comfortably fits in the buffer cache,
> and you would be unlikely to see much performance difference on a modern
> and reasonably beefy machine with enough memory and local disks.
> Benchmarking with cold cache (or over NFS) would be interesting.

It would also be easy to test these cases by setting the pack window size
to something small (e.g. 1 MB) and the pack limit to something equally
small (e.g. 25 MB), and stuffing a delay of 20 ms into the code path that
xmmaps a new window when it's not already opened.

I'm glad you started working on this; it looks like it may lead to a
better pack ordering.

-- 
Shawn.