Re: [PATCH v4 6/7] revision.c: generation-based topo-order algorithm

Jakub Narebski <jnareb@xxxxxxxxx> · Fri, 26 Oct 2018 18:55:22 +0200

Derrick Stolee <stolee@xxxxxxxxx> writes:
> On 10/22/2018 9:37 AM, Jakub Narebski wrote:
>> "Derrick Stolee via GitGitGadget" <gitgitgadget@xxxxxxxxx> writes:
[...]
>> Sidenote: the new algorithm looks a bit like Unix pipeline, where each
>> step of pipeline does not output much more than next step needs /
>> requests.
>
> That's essentially the idea.

Some of the newer languages have built-in support for similar kind of
pipeline for connecting processes, be it channels in Go, supplies and
suppliers in Perl6.  I wonder if there exists some library implementing
this kind of construct in C.

That aside, I wonder if when there would be support for more
reachability indices than generation numbers, if it wouldn't be better
to pass a commit as a limiter (up to this commit), than specific indices
like current passing of generation number.  Just food for thoughs...

[...]
>>> In my local testing, I used the following Git commands on the
>>> Linux repository in three modes: HEAD~1 with no commit-graph,
>>> HEAD~1 with a commit-graph, and HEAD with a commit-graph. This
>>> allows comparing the benefits we get from parsing commits from
>>> the commit-graph and then again the benefits we get by
>>> restricting the set of commits we walk.
>>>
>>> Test: git rev-list --topo-order -100 HEAD
>>> HEAD~1, no commit-graph: 6.80 s
>>> HEAD~1, w/ commit-graph: 0.77 s
>>>    HEAD, w/ commit-graph: 0.02 s
>>>
>>> Test: git rev-list --topo-order -100 HEAD -- tools
>>> HEAD~1, no commit-graph: 9.63 s
>>> HEAD~1, w/ commit-graph: 6.06 s
>>>    HEAD, w/ commit-graph: 0.06 s
>>>
>>> This speedup is due to a few things. First, the new generation-
>>> number-enabled algorithm walks commits on order of the number of
>>> results output (subject to some branching structure expectations).
>>> Since we limit to 100 results, we are running a query similar to
>>> filling a single page of results. Second, when specifying a path,
>>> we must parse the root tree object for each commit we walk. The
>>> previous benefits from the commit-graph are entirely from reading
>>> the commit-graph instead of parsing commits. Since we need to
>>> parse trees for the same number of commits as before, we slow
>>> down significantly from the non-path-based query.
>>>
>>> For the test above, I specifically selected a path that is changed
>>> frequently, including by merge commits. A less-frequently-changed
>>> path (such as 'README') has similar end-to-end time since we need
>>> to walk the same number of commits (before determining we do not
>>> have 100 hits). However, get the benefit that the output is
>>> presented to the user as it is discovered, much the same as a
>>> normal 'git log' command (no '--topo-order'). This is an improved
>>> user experience, even if the command has the same runtime.
>>>
>> First, do I understand it correctly that in first case the gains from
>> new algorithms are so slim because with commit-graph file and no path
>> limiting we don't hit repository anyway; we walk less commits, but
>> reading commit data from commit-graph file is fast/
>
> If you mean 0.77s to 0.02s is "slim" then yes, it is because the
> commit-graph command already made a full walk of the commit history
> faster. (I'm only poking at this because the _relative_ improvement is
> significant, even if the command was already sub-second.)

First, you didn't provide us with percentages, i.e. relative improvement
(and I am lazy).  Second, 0.02s can be within the margin of error,
depending on how it is measured, and how stable this measurement is.

>> Second, I wonder if there is some easy way to perform automatic latency
>> tests, i.e. how fast does Git show the first page of output...
>
> I have talked with Jeff Hostetler about this, to see if we can have a
> "time to first page" traced with trace2, but we don't seem to have
> access to that information within Git. We would need to insert it into
> the pager. The "-100" is used instead.

Perhaps another solution to the problem of "first page of output"
latency tests could be feasible, namely create a helper test-pager-1p
"pager" program that would automatically quit after first page of
output; or perhaps even one that benchmarks each page of output
automatically.

There exists 'pv' (pipe viewer) program for pipes, so I think it would
be possible to do equivalent, but as a pager.

[...]
>>> +static inline void test_flag_and_insert(struct prio_queue *q, struct commit *c, int flag)
>>> +{
>>> +	if (c->object.flags & flag)
>>> +		return;
>>> +
>>> +	c->object.flags |= flag;
>>> +	prio_queue_put(q, c);
>>> +}
>>
>> This is an independent change, though I see that it is quite specific
>> (as opposed to quite generic prio_queue_peek() operation added earlier
>> in this series), so it does not make much sense as standalone change.
>>
>> It inserts commit into priority queue only if it didn't have flags set,
>> and sets the flag (so we won't add it to the queue again, not without
>> unsetting the flag), am I correct?
>
> Yes, this pattern of using a flag to avoid duplicate entries in the
> priority queue appears in multiple walks. It wasn't needed before. We
> call it four times in the code below.

>> I guess that we use test_flag_and_insert() instead of prio_queue_put()
>> to avoid duplicate entries in the queue.  I think the queue is initially
>> populated with the starting commits, but those need not to be
>> unreachable from each other, and walking down parents we can encounter
>> starting commit already in the queue.  Am I correct?
>
> We can also reach commits in multiple ways, so the initial conditions
> are not the only ways to insert duplicates.

Right.

[...]
>>> +	if (c->object.flags & UNINTERESTING)
>>> +		mark_parents_uninteresting(c);
>>> +
>>> +	for (p = c->parents; p; p = p->next)
>>> +		test_flag_and_insert(&info->explore_queue, p->item, TOPO_WALK_EXPLORED);
>>
>> Do we need to insert parents to the queue even if they were marked
>> UNINTERESTING?
>
> We need to propagate the UNINTERESTING flag to our parents. That
> propagation happens in process_parents().

I think I understand.  We need to propagate UNINTERESTING flag down the
chain, isn't it?

[...]
>>>     static void init_topo_walk(struct rev_info *revs)
>>>   {
>>>   	struct topo_walk_info *info;
>>> +	struct commit_list *list;
>>>   	revs->topo_walk_info = xmalloc(sizeof(struct topo_walk_info));
>>>   	info = revs->topo_walk_info;
>>>   	memset(info, 0, sizeof(struct topo_walk_info));
>>>   -	limit_list(revs);
>>> -	sort_in_topological_order(&revs->commits, revs->sort_order);
>>> +	init_indegree_slab(&info->indegree);
>>> +	memset(&info->explore_queue, '\0', sizeof(info->explore_queue));
>>> +	memset(&info->indegree_queue, '\0', sizeof(info->indegree_queue));
>>> +	memset(&info->topo_queue, '\0', sizeof(info->topo_queue));
>>
>> Why this memset uses '\0' as a filler value and not 0?  The queues are
>> not strings [and you use 0 in other places].

I think you missed answering about this issue.

[...]
>> Looks all right.
>
> Thanks for taking the time on this huge patch!

You are welcome.

-- 
Jakub Narębski