On 10/4/2017 2:10 AM, Junio C Hamano wrote:
> Derrick Stolee <stolee@xxxxxxxxx> writes:
>> ...
>> I understand that this patch on its own does not have good numbers. I
>> split the patches 3 and 4 specifically to highlight two distinct changes:
>> Patch 3: Unroll the len loop that may inspect all files multiple times.
>> Patch 4: Parse less while disambiguating.
>> Patch 4 more than makes up for the performance hits in this patch.
> Now you confused me even more. When we read the similar table that
> appears in [Patch 4/5], what does the "Base Time" column mean?
> Vanilla Git with [Patch 3/5] applied? Vanilla Git with [Patch 4/5]
> alone applied? Something else?
In PATCH 3, 4, and 5, I used the commit-by-commit diff for the perf
numbers, so the "Base Time" for PATCH 4 is the time calculated when
PATCH 3 is applied. The table in the [PATCH 0/5] message includes the
relative change for all commits.
I recalculated the change for each patch relative to the
baseline (PATCH 2). Looking again, it appears I misspoke and PATCH 4
does include a +8% change for a fully-repacked Linux repo relative to
PATCH 2. Since PATCH 5 includes an optimization targeted directly at
large packfiles, the final performance gain is significant in the
fully-packed cases.
It is also worth looking at the absolute times for these cases, since
the fully-packed case is significantly faster than the multiple-packfile
case, so the relative change impacts users less.
One final note: the improvement was clearer in test p0008.1 when the
test included "sort -R" to shuffle the known OIDs. Providing OIDs in
lexicographic order has a significant effect on performance, and that
ordering does not reflect real-world usage. I removed the "sort -R" because
it is a GNU-ism, but if there is a good cross-platform alternative I
would be happy to replace it.
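As a rough illustration of the shuffling step only, and not a proposal for the test script itself, a seeded shuffle could look like the sketch below. The function name shuffle_oids and the sample OIDs are hypothetical, and Python is used purely for illustration. A fixed seed keeps the order reproducible between runs.

```python
import random

def shuffle_oids(oids, seed=0):
    """Return the OIDs in a pseudo-random but reproducible order."""
    shuffled = list(oids)                   # copy so the caller's list is untouched
    random.Random(seed).shuffle(shuffled)   # private generator, fixed seed
    return shuffled

# Hypothetical abbreviated OIDs, given in lexicographic order.
sorted_oids = ["0a1b", "3c4d", "7e8f", "a9b0", "f1e2"]
print(shuffle_oids(sorted_oids))
```

The same seed always yields the same permutation, so repeated perf runs would look up the objects in an identical order.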
p0008.1: find_unique_abbrev() for existing objects
--------------------------------------------------
For 10 repeated tests, each checking 100,000 known objects, we find the
following results when running in a Linux VM:
| Repo | Baseline | Patch 3 | Rel % | Patch 4 | Rel % | Patch 5 | Rel % |
|-------|----------|---------|-------|---------|-------|---------|-------|
| Git | 0.09 | 0.06 | -33% | 0.05 | -44% | 0.05 | -44% |
| Git | 0.11 | 0.08 | -27% | 0.08 | -27% | 0.08 | -27% |
| Git | 0.09 | 0.07 | -22% | 0.06 | -33% | 0.06 | -33% |
| Linux | 0.13 | 0.32 | +146% | 0.14 | + 8% | 0.05 | -62% |
| Linux | 1.13 | 1.12 | - 1% | 0.94 | -17% | 0.88 | -22% |
| Linux | 1.08 | 1.05 | - 3% | 0.86 | -20% | 0.80 | -26% |
| VSTS | 0.12 | 0.23 | +92% | 0.11 | - 8% | 0.05 | -58% |
| VSTS | 1.02 | 1.08 | + 6% | 0.95 | - 7% | 0.95 | - 7% |
| VSTS | 2.25 | 2.08 | - 8% | 1.82 | -19% | 1.93 | -14% |
(Each repo has three versions, in order: 1 packfile, multiple packfiles,
and multiple packfiles and loose objects.)
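For anyone who wants to check the "Rel %" columns, they can be reproduced from the times in the table above, assuming each one is the change relative to the Baseline column, rounded to the nearest percent. The helper name rel_change is only for illustration, and only the two single-packfile rows are copied here.

```python
def rel_change(baseline, patched):
    """Percent change of `patched` relative to `baseline`, rounded to an integer."""
    return round((patched - baseline) / baseline * 100)

# (repo, baseline, patch 3, patch 4, patch 5), times copied from the table.
rows = [
    ("Linux, 1 packfile", 0.13, 0.32, 0.14, 0.05),
    ("VSTS, 1 packfile", 0.12, 0.23, 0.11, 0.05),
]

for name, base, *patched in rows:
    changes = [rel_change(base, t) for t in patched]
    print(name, changes)
```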
Thanks,
-Stolee