On 4/3/2018 6:18 AM, Ævar Arnfjörð Bjarmason wrote:
On Tue, Apr 03 2018, Lars Schneider wrote:
What is the state of this series? I can't find it in git/git nor in
git-for-windows/git. I think Stolee mentioned the config in
his Git Merge talk [1] and I was about to test it/roll it out :-)
It's in the gvfs branch of git@xxxxxxxxxx:Microsoft/git.git, i.e. it's
not in Git for Windows, but used in Microsoft's own in-house version
used for Windows.git.
Thanks for adding me to CC. I mentioned it in my talk because that was
one thing we shipped internally as a "quick fix" until we could do the
right thing.
If I remember correctly, Jeff abandoned shipping this upstream because
it did have the feel of a hack and we wanted to see if users used the
config setting or really cared about the output values. We saw fast
adoption of the feature and even turned the config setting on
automatically in the following version of GVFS.
I may be misunderstanding this feature, but my impression was that it
was a kludge as a workaround until the commit graph code landed, because
once we have that then surely we can just cheaply report the actual (or
approximate?) number in the common case, but of course it may still be
slow if your commit graph file is out of date.
You are correct that the commit-graph file may be out of date, causing
slower performance. Even worse: the current graph patch only provides a
constant-multiple speedup (still walking the same number of commits, but
each commit is parsed much faster).
Speaking of our GVFS-specific fork [0], the 'gvfs' branch was updated
just yesterday with a couple of changes that I am prepping for
submission upstream:
* Lazy-load trees when parsing commits from commit-graph [1]
* Compute and consume generation numbers [2]
Each of these will speed up this ahead/behind calculation in different
ways. [1] makes the cost of loading each commit a bit faster, saving up
to 20% overall. [2] uses generation numbers in paint_down_to_common() to
make the while() condition O(1) instead of O(Q) where Q is the size of
the priority queue. The Windows repo is particularly "wide" with many
parallel branches being merged in complicated ways, so the queue becomes
quite large. This use of generation numbers saves about 4% on some
ahead/behind calculations. This speedup is modest, but the existing code
already made good use of limiting the commit walk to be mostly the
"important" commits.
The real benefit of generation numbers will manifest in a way to make
--topo-order much faster when rendering a small number of commits.
The generation numbers _could_ be used to approximate the ahead/behind
calculation in the following way: When comparing A and B, and gen(A) <
gen(B), then A is at least (gen(B) - gen(A)) behind. That's the only
information that can be gathered directly from those values, but may be
enough to short circuit an exact count.
To truly accelerate these ahead/behind calculations to be sub-linear* in
the ahead/behind counts, we would need a bitmap-based approach. The
object-reachability bitmap is a non-starter for client machines in the
Windows repo, but perhaps a commit-reachability bitmap could be
interesting. Performing set operations on the bitmaps could more quickly
answer these questions. Just thinking about it makes me want to go down
a deep rabbit hole, investigating ways to compute, store, and use these
bitmaps. However: let's wait and see how necessary it is as the
commit-graph feature stabilizes. (*These bitmap approaches are not
guaranteed to be sub-linear, because it may include iterating through a
list of O(N) bits, but good run-length encodings will likely make the
count operation very fast, even with a set-difference operation included.)
There are too many fun things to work on, not enough time!
Thanks,
-Stolee
[0] https://github.com/microsoft/git
Fork of GitForWindows that ships to Windows developers
[1]
https://github.com/Microsoft/git/commit/29114bf86f591f5c87075f779a1faa2d0f17b92f
Lazy-load trees when parsing commits from commit-graph
(accidentally squashed to one commit)
[2]
https://github.com/microsoft/git/compare/879b7d3b1bddea2587b28cdd656c9c655018683a...a0731ca93a35fd042560c4b30e8e0edbdfa4bf9f
Compute and consume generation numbers