Re: [PATCH 1/3] commit-reach: use one walk in remove_redundant()

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 1/30/2021 10:59 PM, Derrick Stolee wrote:
> On 1/28/2021 3:51 PM, Junio C Hamano wrote:
>> "Derrick Stolee via GitGitGadget" <gitgitgadget@xxxxxxxxx> writes:
>>> +		parents = c->parents;
>>> +		while (parents) {
>>> +			if (!(parents->item->object.flags & STALE)) {
>>> +				parents->item->object.flags |= STALE;
>>> +				prio_queue_put(&queue, parents->item);
>>> +			}
>>> +			parents = parents->next;
>>> +		}
>>> +	}
>>
>> So, the inner loop makes sure we won't revisit STALE parent, but
>> keep digging parents we haven't seen, and stop when the generation
>> is old enough.  What happens when there is no generation number
>> computed yet, I wonder...  We'll keep getting infinity and dig all
>> the way down to root?
> 
> If we are on commits that have no generation number yet, then we
> will walk until reaching commits in the commit-graph file that have
> a computed generation (or in the heuristic case, when we have reached
> all but one of the commits).
> 
> In the case of the commit-graph, all commits will have generation
> number "infinity". In such a case, perhaps the old algorithm _is_
> the best we can do, at least for now.
> 
> The trade-off is that we might walk more commits in unusual cases
> with few input commits. That quadratic behavior will take over as
> the input size grows, no matter if there is a commit-graph or not.
> 
> I can do a big more digging into these unusual cases, especially
> when we cannot rely on commit-graphs being present.

Indeed, the old algorithm is better for cases where generation
numbers do not help and there are multiple independent commits.

For example, this command in the Linux kernel repo changes from
0.047s in the old algorithm to 6.6s with the new algorithm:

git -c core.commitGraph=false merge-base --independent 2ecedd756908 d2360a398f0b >/dev/null

> One way to ensure we do not regress from the current behavior
> would be to condition the new algorithm with
> 
> 	if (generation_numbers_enabled(the_repository))
> 		new_algorithm();
> 	else
> 		old_algorithm();
> 
> much like in repo_is_descendant_of().

I will include this in a v2.

Thanks,
-Stolee




[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux