Re: topological index field for commit objects

On Wed, Jun 29, 2016 at 1:39 PM, Junio C Hamano <gitster@xxxxxxxxx> wrote:
> Stefan Beller <sbeller@xxxxxxxxxx> writes:
>
>> On Wed, Jun 29, 2016 at 11:59 AM, Junio C Hamano <gitster@xxxxxxxxx> wrote:
>>> On Wed, Jun 29, 2016 at 11:31 AM, Marc Strapetz
>>> <marc.strapetz@xxxxxxxxxxx> wrote:
>>>> This is no RFE but rather recurring thoughts whenever I'm working with
>>>> commit graphs: a topological index attribute for commit objects would be
>>>> incredibly useful. By "topological index" I mean a simple integer for which
>>>> the following condition holds true:
>>>
>>> Look for "generation numbers" in the list archive, perhaps?
>>
>> Thanks for the pointer to the interesting discussions.
>>
>> In http://www.spinics.net/lists/git/msg161363.html
>> Linus wrote in a discussion with Jeff:
>>
>>> Right now, we do *have* a "generation number". It's just that it's
>>> very easy to corrupt even by mistake. It's called "committer date". We
>>> could improve on it.
>>
>> Would it make sense to refuse to create commits whose committer date is
>> earlier than any of their parents' committer dates (except when the user
>> gives a `--dammit-I-know-I-break-a-widely-used-heuristic`)?
>
> I think that has also been discussed in the past.

I should have guessed that and tried to find it.

> I do not think it
> would help very much in practice, as projects already have up to 10
> years (and the ones migrated from CVS, even more) worth of commits
> they cannot rewrite that may record incorrect committer dates.

Chances are that those 10 years of history are correct time-wise, as long
as nobody introduced a bad date maliciously.

> You'd need something like "you can trust committer dates that are
> newer than this date" per project

and per Git version, as old versions of Git can still be used later.

> to switch between slow path and
> fast path, with an updated fsck that knows how to compute that
> number after you pulled from somebody who used that overriding
> option.

Well, you could have a project setting (`config.sortedDates`)
that is computed automatically once when cloning a project;
depending on that setting you can take the fast path.

Additionally, when that setting is set, you'd enforce correct
dates when committing and merging (read: pulling), so that it
stays true.
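
To make the "computed once when cloning" part concrete, here is a
minimal sketch in Python. It is not Git code: the `History` mapping
and `dates_are_sorted()` are made up for illustration, and a real
implementation would walk the object database rather than a dict.

```python
from typing import Dict, List, Tuple

# Hypothetical in-memory history: commit id -> (committer timestamp, parent ids).
History = Dict[str, Tuple[int, List[str]]]


def dates_are_sorted(history: History) -> bool:
    """Return True if no commit is dated earlier than any of its parents."""
    for date, parents in history.values():
        for parent in parents:
            # Parents missing from the mapping (e.g. a shallow clone) are skipped.
            if parent in history and date < history[parent][0]:
                return False
    return True


good = {"a": (100, []), "b": (200, ["a"]), "c": (300, ["a", "b"])}
bad = {"a": (100, []), "b": (50, ["a"])}  # "b" is dated before its parent

print(dates_are_sorted(good), dates_are_sorted(bad))
```

The result of such a scan is what the setting would cache, so the
full walk happens only once per clone.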

>
> If the use of generation number can somehow be limited narrowly, we
> may be able to incrementally introduce it only for new commits, but
> I haven't thought things through, so let me do so aloud here ;-)

I think of it as "committer date == generation number", i.e. just a
special way of recording the generation number, with gaps in between.
This loses the ability to bound how many commits can lie between two
given "numbers", though, which I think is a minor loss.
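
A rough sketch of how a walk could use the committer date that way
(Python, hypothetical names, and assuming no commit is dated earlier
than any of its parents; if that assumption is broken, the pruning
below can give wrong answers):

```python
from typing import Dict, List, Tuple

# Hypothetical in-memory history: commit id -> (committer timestamp, parent ids).
History = Dict[str, Tuple[int, List[str]]]


def is_ancestor(history: History, ancestor: str, descendant: str) -> bool:
    """Walk back from `descendant`, pruning commits older than `ancestor`.

    A commit counts as an ancestor of itself here.
    """
    cutoff = history[ancestor][0]
    stack = [descendant]
    seen = set()
    while stack:
        commit = stack.pop()
        if commit == ancestor:
            return True
        if commit in seen:
            continue
        seen.add(commit)
        date, parents = history[commit]
        if date < cutoff:
            # With sorted dates, everything behind this commit is older
            # still, so `ancestor` cannot be reachable from here.
            continue
        stack.extend(parents)
    return False


history = {
    "a": (100, []),
    "b": (200, ["a"]),
    "c": (300, ["a"]),
    "d": (400, ["b"]),
}
print(is_ancestor(history, "a", "d"), is_ancestor(history, "c", "d"))
```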

>
> Suppose we use it only for this purpose:
>
>  * When we have two commits, C1 and C2, with generation numbers G1
>    and G2, we can say "C1 cannot possibly be an ancestor of C2" if
>    G1 > G2.  We cannot say anything else based on generation
>    numbers (or lack thereof).
>
> then I think we could just say "A newly created commit must record
> generation number G that is larger than generation numbers of its
> parent commits; ignore parents that lack generation number for the
> purpose of this sentence".
>
> I am not sure if that limited use is all that useful, though.
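
Spelled out as a sketch (Python; the function names are mine, not
anything in Git), the rule and the one permitted conclusion would be:

```python
from typing import Iterable, Optional


def new_generation(parent_generations: Iterable[Optional[int]]) -> int:
    """Pick a generation number larger than that of every numbered parent.

    Parents without a generation number (None) are ignored, as in the
    quoted rule. A commit with no numbered parents gets 1.
    """
    known = [g for g in parent_generations if g is not None]
    return max(known, default=0) + 1


def cannot_be_ancestor(g1: Optional[int], g2: Optional[int]) -> bool:
    """True only if the numbers say C1 cannot be an ancestor of C2.

    If either commit lacks a number, nothing can be concluded.
    """
    if g1 is None or g2 is None:
        return False
    return g1 > g2


print(new_generation([3, None, 7]), cannot_be_ancestor(8, 3))
```

Note that the conclusion only holds when every commit on the path
from C2 back to C1 carries a number; an unnumbered commit in between
lets the numbering restart below G1.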

I did *not* propose to introduce the generation number, but
rather meant:
* we already have committer date
* it works pretty well
* only a tiny patch is required to tighten the heuristic so that it works
  even better (going forward), by preventing accidental commits whose
  committer date is earlier than that of their parents.
* we postpone drastic changes (i.e. introduction of generation
  numbers or change of algorithms) for now.
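
The kind of check I have in mind could look roughly like this. It is
a sketch in Python rather than a real patch; `check_committer_date()`
and its `force` parameter are hypothetical and stand in for the
override flag.

```python
from typing import Iterable


def check_committer_date(new_date: int,
                         parent_dates: Iterable[int],
                         force: bool = False) -> int:
    """Return `new_date` if it is acceptable for a new commit.

    Raises ValueError when the date is earlier than the newest parent's
    committer date, unless `force` is set. Root commits (no parents)
    are always accepted.
    """
    latest_parent = max(parent_dates, default=None)
    if latest_parent is not None and new_date < latest_parent and not force:
        raise ValueError(
            f"committer date {new_date} is earlier than "
            f"parent committer date {latest_parent}"
        )
    return new_date


print(check_committer_date(200, [100, 150]))
print(check_committer_date(50, [100], force=True))
```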