Re: [PATCH v3 7/7] revision.c: refactor basic topo-order logic

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 10/25/2018 5:43 AM, Jeff King wrote:
On Thu, Oct 11, 2018 at 12:21:44PM -0400, Derrick Stolee wrote:

2. INDEGREE: using the indegree_queue priority queue (ordered
     by maximizing the generation number), add one to the in-
     degree of each parent for each commit that is walked. Since
     we walk in order of decreasing generation number, we know
     that discovering an in-degree value of 0 means the value for
     that commit was not initialized, so should be initialized to
     two. (Recall that in-degree value "1" is what we use to say a
     commit is ready for output.) As we iterate the parents of a
     commit during this walk, ensure the EXPLORE walk has walked
     beyond their generation numbers.
I wondered how this would work for INFINITY. We can't know the order of
a bunch of INFINITY nodes at all, so we never know when their in-degree
values are "done". But if I understand the EXPLORE walk, we'd basically
walk all of INFINITY down to something with a real generation number. Is
that right?

But after that, I'm not totally clear on why we need this INDEGREE walk.
The INDEGREE walk is an important element for Kahn's algorithm. The final
output order is dictated by peeling commits of "indegree zero" to ensure all
children are output before their parents. (Note: since we use literal 0 to
mean "uninitialized", we peel commits when the indegree slab has value 1.)

This walk replaces the indegree logic from sort_in_topological_order(). That
method performs one walk that fills the indegree slab, then another walk
that peels the commits with indegree 0 and inserts them into a list.
I guess my big question here was: if we have generation numbers, do we
need Kahn's algorithm? That is, in a fully populated set of generation
numbers (i.e., no INFINITY), we could always just pick a commit with the
highest generation number to show.

So if we EXPLORE down to a real generation number in phase 1, why do we
need to care about INDEGREE anymore? Or am I wrong that we always have a
real generation number (i.e., not INFINITY) after EXPLORE? (And if so,
why is exploring to a real generation number a bad idea; presumably
it's due to a worst-case that goes deeper than we'd otherwise need to
here).

The issue is that we our final order (used by level 3) is unrelated to generation number. Yes, if we prioritized by generation number then we could output the commits in _some_ order that doesn't violate topological constraints. However, we are asking for a different priority, which is different than the generation number priority.

In the case of "--topo-order", we want to output the commits reachable from the second parent of a merge before the commits reachable from the first parent. However, in most cases the generation number of the first parent is higher than the second parent (there are more things in the merge chain than in a small topic that got merged). The INDEGREE is what allows us to know when we can peel these commits while still respecting the priority we want at the end.

Thanks,
-Stolee



[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux