Re: Change set based shallow clone

Jeff King <peff@xxxxxxxx> writes:

> I'm just coming into this discussion in the middle and know very little
> about the rev-list code, so please humor me and tell me why my
> suggestion is completely irrelevant.

Not irrelevant.

> The problem you describe seems to come from doing a depth-first display
> of each branch. Why not look at the tip of each "active" branch
> simultaneously and pick the one with the most recent date? Something
> like:

That's what we have been doing from day one.

The trouble Linus illustrated is that in a global project you
cannot rely on timestamps always being correct.  You can use
them as a hint, but you need to be prepared to do sensible things
when some people screw up the time.

> On Sat, Sep 09, 2006 at 01:05:42PM -0700, Linus Torvalds wrote:
>
>> The example is
>> 
>> 		    A		<--- tip of branch
>> 		   / \
>> 		  B   E
>> 		  |   |
>> 		  |   F
>> 		  | /
>> 		  C 
>> 		  |
>> 		  D
>> 		...
>> 
>> where the lettering is in "date order" (ie "A" is more recent than "B" 
>> etc). In this situation, we'd start following the branch A->B->C->D->.. 
>> before we even start looking at E and F, because they all _look_ more 
>> recent.

The ancestry graph, topologically, flows from bottom to top, but
the timestamps are in F E D C B A order (A is closest to the
present, F is in the most distant past).  Somebody forked from C
on a machine with a slow clock, made two commits (F, then E) with
wrong (from the point of view of the person who made commit C
anyway) timestamps, and that side branch ended up merged with B
into A.

You start following A, read A and find B and E (now B and E are
"active" in your lingo), pop B because it is the most recent.
You look at B, find C is the parent, push C into the active list
(which is always sorted by timestamp order).  Now "active" are C
and E, and C is the most recent so you pop C.  That shows C (and
then D) before E and F, even though E and F are descendants of
C.
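
To make the walk above concrete, here is a minimal Python sketch of
a timestamp-ordered traversal run on Linus's example.  It is not the
actual rev-list code; the name date_order_walk and the integer
timestamps are made up for illustration, and only their relative
order (A newest, F oldest) comes from the example.

```python
import heapq

# Example graph: each commit maps to its parents.
PARENTS = {"A": ["B", "E"], "B": ["C"], "E": ["F"],
           "F": ["C"], "C": ["D"], "D": []}
# Hypothetical timestamps; E and F come from the machine with the slow clock.
TIMESTAMP = {"A": 6, "B": 5, "C": 4, "D": 3, "E": 2, "F": 1}


def date_order_walk(tip, parents, timestamp):
    """Always pop the "active" commit with the most recent timestamp."""
    active = [(-timestamp[tip], tip)]  # heapq is a min-heap, so negate
    seen = {tip}
    order = []
    while active:
        _, commit = heapq.heappop(active)
        order.append(commit)
        for parent in parents[commit]:
            if parent not in seen:
                seen.add(parent)
                heapq.heappush(active, (-timestamp[parent], parent))
    return order


print(" ".join(date_order_walk("A", PARENTS, TIMESTAMP)))
```

With these timestamps the walk prints A B C D E F, so C and D come
out before their descendants E and F.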

In the past people suggested workarounds such as making
commit-tree ensure that a child commit has a timestamp no older
than any of its parents', by either refusing to make such a
commit or automatically adjusting the timestamp.  That would
surely work around the problem, but if somebody made a commit
with a wrong timestamp far into the future, every commit made
on top of it would get an "adjusted" timestamp that is pretty
much meaningless, so that would not work well in practice.
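
A small Python sketch of why the "automatically adjust" variant
degrades.  Nothing here is commit-tree behaviour; adjusted_timestamp
and linear_history are hypothetical helpers, and the clock values
are invented.

```python
def adjusted_timestamp(clock_now, parent_timestamps):
    """Never record a timestamp older than any parent's (hypothetical rule)."""
    return max([clock_now, *parent_timestamps])


def linear_history(clocks):
    """Build a single-parent chain; each commit sees its committer's clock."""
    recorded = []
    for clock_now in clocks:
        parents = recorded[-1:]  # empty for the root commit
        recorded.append(adjusted_timestamp(clock_now, parents))
    return recorded


# The third committer's clock is far in the future.
print(linear_history([100, 200, 999999, 300, 400]))
```

This prints [100, 200, 999999, 999999, 999999]: once the bad
timestamp is in, every later commit on top of it is clamped to the
same value until real time catches up.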

If we had a commit generation number field in the commit object,
we could have used something like that.  Each commit gets a
generation number that is the maximum of its parents' generation
numbers plus one, and we prime the recursive definition by giving
root commits generation number 0.  rev-list would then use the
generation number, not the timestamp, to do the above
"date-order" sort, and we would always get the topology right.  Of
course fsck-objects would need to be taught to check and complain
if the generation numbers between parent and child are inverted.
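
As a rough illustration of the idea, the following Python sketch
computes generation numbers for Linus's example graph and walks it
by generation instead of by timestamp.  It is only a sketch under
assumptions: no such field exists in the commit object, and the
names generation_numbers, generation_walk and inverted_edges, as
well as the integer timestamps (used only to break ties), are made
up here.

```python
import heapq

PARENTS = {"A": ["B", "E"], "B": ["C"], "E": ["F"],
           "F": ["C"], "C": ["D"], "D": []}
# Hypothetical skewed timestamps, used only as a tie-breaker below.
TIMESTAMP = {"A": 6, "B": 5, "C": 4, "D": 3, "E": 2, "F": 1}


def generation_numbers(parents):
    """Roots get 0; others get max(parents' generation) + 1."""
    gen = {}

    def visit(commit):  # recursion is fine for a toy graph
        if commit not in gen:
            gen[commit] = 1 + max((visit(p) for p in parents[commit]),
                                  default=-1)
        return gen[commit]

    for commit in parents:
        visit(commit)
    return gen


def generation_walk(tip, parents, gen, timestamp):
    """Pop the active commit with the highest generation number first."""
    active = [(-gen[tip], -timestamp[tip], tip)]
    seen = {tip}
    order = []
    while active:
        _, _, commit = heapq.heappop(active)
        order.append(commit)
        for parent in parents[commit]:
            if parent not in seen:
                seen.add(parent)
                heapq.heappush(active,
                               (-gen[parent], -timestamp[parent], parent))
    return order


def inverted_edges(parents, gen):
    """The kind of check fsck would need: child must outrank each parent."""
    return [(c, p) for c in parents for p in parents[c] if gen[c] <= gen[p]]


GEN = generation_numbers(PARENTS)
print(" ".join(generation_walk("A", PARENTS, GEN, TIMESTAMP)))
```

Here D=0, C=1, B=2, F=2, E=3, A=4, and the walk prints A E B F C D:
every commit is shown before its parents no matter how wrong the
timestamps are.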

But it's too late in the game now -- maybe in git version 47.
