On Sun, Jan 07 2018, Derrick Stolee jotted:

>     git log --oneline --raw --parents
>
> Num Packs | Before MIDX | After MIDX |  Rel % | 1 pack %
> ----------+-------------+------------+--------+----------
>         1 |     35.64 s |    35.28 s |  -1.0% |   -1.0%
>        24 |     90.81 s |    40.06 s | -55.9% |  +12.4%
>       127 |    257.97 s |    42.25 s | -83.6% |  +18.6%
>
> The last column is the relative difference between the MIDX-enabled repo
> and the single-pack repo. The goal of the MIDX feature is to present the
> ODB as if it was fully repacked, so there is still room for improvement.
>
> Changing the command to
>
>     git log --oneline --raw --parents --abbrev=40
>
> has no observable difference (sub 1% change in all cases). This is likely
> due to the repack I used putting commits and trees in a small number of
> packfiles so the MRU cache works very well. On more naturally-created
> lists of packfiles, there can be up to 20% improvement on this command.
>
> We are using a version of this patch with an upcoming release of GVFS.
> This feature is particularly important in that space since GVFS performs
> a "prefetch" step that downloads a pack of commits and trees on a daily
> basis. These packfiles are placed in an alternate that is shared by all
> enlistments. Some users have 150+ packfiles and the MRU misses and
> abbreviation computations are significant. Now, GVFS manages the MIDX file
> after adding new prefetch packfiles using the following command:
>
>     git midx --write --update-head --delete-expired --pack-dir=<alt>

(Not a critique of this, just a (stupid) question)

What's the practical use-case for this feature? Since it doesn't help
with --abbrev=40, the speedup is all in the part that ensures we don't
show an ambiguous SHA-1.

The reason we do that at all is because it makes for a prettier UI. Are
there things that both want the pretty SHA-1 and also care about the
throughput? I'd have expected machine parsing to just use
--no-abbrev-commit.

If something cares about both throughput and, e.g., is saving the
abbreviated SHA-1s, isn't it better off picking some arbitrary size
(e.g. --abbrev=20)? After all, the default abbreviation is going to show
something as small as possible, which may soon become ambiguous after
the next commit.
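
To make the question concrete, here is a rough Python sketch of what I
understand the disambiguation step to cost. It is not git's actual code:
the function names, the flat sorted lists standing in for pack indexes,
and the min_len default are all assumptions made for illustration. The
idea is that finding the shortest unique prefix needs one binary search
per sorted list of object names, so N packs mean N searches, while a
single merged list (which is how I read the MIDX) needs one.

```python
import bisect


def common_prefix_len(a: str, b: str) -> int:
    """Number of leading characters that a and b share."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n


def unique_len_in_index(sorted_oids, oid, min_len=7):
    """Shortest prefix length of oid that no other entry in this list shares."""
    i = bisect.bisect_left(sorted_oids, oid)
    present = i < len(sorted_oids) and sorted_oids[i] == oid
    # In a sorted list, only the immediate neighbours can share the
    # longest prefix with oid, so they are the only entries to inspect.
    neighbours = (i - 1, i + 1) if present else (i - 1, i)
    need = min_len
    for j in neighbours:
        if 0 <= j < len(sorted_oids):
            need = max(need, common_prefix_len(sorted_oids[j], oid) + 1)
    return min(need, len(oid))


def unique_len_across_packs(pack_indexes, oid, min_len=7):
    """Hypothetical multi-pack case: one binary search per pack index."""
    return max(unique_len_in_index(idx, oid, min_len) for idx in pack_indexes)


def unique_len_merged(pack_indexes, oid, min_len=7):
    """Hypothetical single-index case: one search over one merged list."""
    merged = sorted(set().union(*pack_indexes))
    return unique_len_in_index(merged, oid, min_len)
```

Both functions give the same length; only the number of searches
differs. With a fixed --abbrev=N and no uniqueness check, none of this
work would be needed, which is what prompts the question above.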