[RFC] New commit object headers: generation and note headers

As the new major git release, 1.6.0, is close (BTW, I wonder if git will
ever reach a 2.0.0 release...), I'd like to sum up here the ideas about
extending the commit object with new headers, adding my own thoughts and
comments. I think it would be better to introduce such a major feature
in a major release, and not in one where only the minor number changes.
For some headers, the sooner they are introduced the better.


1. 'generation' header

In the "[BUG?] git log picks up bad commit" thread:
  http://permalink.gmane.org/gmane.comp.version-control.git/72274
(later "[RFH] revision limiting sometimes ignored") the idea of the
'generation' header was resurrected. This header is meant to simplify
removing uninteresting commits in the presence of clock skew, and to
replace various commit-time related heuristics.

The proposed solution (which was discussed at least once in the past on
the git mailing list) is to use a "generation number" for this:
 1. For parentless (root) commits it equals 1 (or 0).
 2. For every other commit, it equals the maximum of the generation
    numbers of its parents, plus 1.
Of course, to avoid recalculating it from the beginning each time, it
must be saved somewhere. The best solution is to use a 'generation'
header for that.
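
The two rules above can be sketched in a few lines of Python. This is
only an illustration, not git code: the `parents` mapping (commit id to
list of parent ids) and the function name are made up for the example,
and root commits are given generation 1 rather than 0.

```python
def generation_numbers(parents):
    """Map each commit id to its generation number.

    `parents` maps a commit id to the list of its parent ids
    (hypothetical in-memory stand-in for the commit graph).
    """
    gen = {}
    for start in parents:
        stack = [start]  # explicit stack: history can be very deep
        while stack:
            commit = stack[-1]
            if commit in gen:
                stack.pop()
                continue
            missing = [p for p in parents[commit] if p not in gen]
            if missing:
                # Parents must be numbered before the commit itself.
                stack.extend(missing)
            else:
                # Rule 1: no parents -> 0 + 1; rule 2: max of parents + 1.
                gen[commit] = 1 + max((gen[p] for p in parents[commit]),
                                      default=0)
                stack.pop()
    return gen


history = {"a": [], "b": ["a"], "c": ["a"], "d": ["b", "c"], "e": ["d", "a"]}
print(generation_numbers(history))
```

Note that merge "e" gets 4, not 2: the number follows the longest path
to a root, so a commit always has a larger number than any of its
ancestors.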

Unfortunately there is a complication: commits written before this
header was introduced do not have a generation number handy. It was
proposed then to use the generation number where it exists, to fall back
to the old date-based heuristic where it does not, and not to
(re)calculate it; the idea is to avoid that cost.
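
As a rough sketch of that fallback (the `Commit` class, its fields and
the function name are assumptions made for this example, and the plain
timestamp comparison is a simplification of the date-based heuristics
git actually uses):

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Commit:
    commit_time: int                  # committer timestamp, may be skewed
    generation: Optional[int] = None  # None for commits without the header


def cannot_be_ancestor(candidate, descendant):
    """True if `candidate` is ruled out as an ancestor of `descendant`."""
    if candidate.generation is not None and descendant.generation is not None:
        # Exact: an ancestor always has a strictly smaller generation.
        return candidate.generation >= descendant.generation
    # Heuristic fallback: only as reliable as the committers' clocks.
    return candidate.commit_time > descendant.commit_time


# A parent whose clock ran ahead of its child's:
parent = Commit(commit_time=200, generation=2)
child = Commit(commit_time=100, generation=3)
print(cannot_be_ancestor(parent, child))                # generation-based
print(cannot_be_ancestor(Commit(200), Commit(100)))     # date-based
```

With generation numbers the skewed parent is not ruled out; with dates
alone it is, which is the kind of mistake the header is meant to avoid.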

My comments:
============
The problem is twofold: when to calculate the generation header, and
what to do with commits that lack it. We could require calculating the
generation header when creating a commit (commit, amend, rebase,
filter-branch), but this might mean that the first few commits after the
'generation' header is introduced would be much slower.

As for older commits which lack the generation number header: perhaps
some (pack-)index-like external storage/cache, where generation numbers
would be saved as we generate them? And perhaps some command to generate
generation numbers in advance, in idle time.
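
A minimal sketch of such a cache, assuming a JSON file as a stand-in for
the pack-index-like storage (the class name, file format and `parents`
mapping are all hypothetical; nothing like this exists in git):

```python
import json
import os


class GenerationCache:
    """Hypothetical on-disk cache: commit id -> generation number."""

    def __init__(self, path):
        self.path = path
        self.gen = {}
        if os.path.exists(path):
            with open(path) as f:
                self.gen = json.load(f)

    def lookup(self, commit, parents):
        """Return the generation of `commit`, computing only what is missing."""
        stack = [commit]
        while stack:
            c = stack[-1]
            if c in self.gen:
                stack.pop()
                continue
            missing = [p for p in parents[c] if p not in self.gen]
            if missing:
                stack.extend(missing)  # number the parents first
            else:
                self.gen[c] = 1 + max((self.gen[p] for p in parents[c]),
                                      default=0)
                stack.pop()
        return self.gen[commit]

    def save(self):
        """Write the cache back so later runs can reuse it."""
        with open(self.path, "w") as f:
            json.dump(self.gen, f)
```

The "generate in advance" command would then amount to calling
lookup() on every branch tip and save() once at the end.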

Note that keeping generation numbers outside the object database is more
error-prone (the cache has to be kept in sync), and they would not
propagate to other repositories.

The question is whether to take grafts and shallows into account when
calculating generation numbers: if they are to be saved in the object
database, then no. If they are saved in external pack-index-like
storage, then perhaps.


2. 'note' header (no semantic meaning)

Some time ago there was a discussion about adding a 'note' header,
initially to save the original SHA-1 of a commit for cherry-picking and
rebase; then for saving explicit or corrected rename info, for saving
the chosen merge strategy, and for saving the original ID from an SCM
import.

My comments:
============

Of all those, I think what makes the most sense is saving the foreign
SCM ID for commits imported from another SCM. This way we do not have to
parse the commit message (fragile and ugly, and it makes two-way
exchange harder: no pristine commit message), or store the IDs
externally (not propagated, prone to be lost).
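
To illustrate the difference from parsing the commit message, here is a
sketch of reading such a header from a raw commit object. The 'note'
line and its "origin svn r1234" value are invented for the example (no
format was agreed on), and multi-line header values are ignored:

```python
def parse_commit(raw):
    """Split a raw commit object into (headers, message).

    Headers are "key value" lines before the first blank line.
    """
    header_text, _, message = raw.partition("\n\n")
    headers = []
    for line in header_text.split("\n"):
        key, _, value = line.partition(" ")
        headers.append((key, value))
    return headers, message


def notes(headers):
    """All values of the hypothetical 'note' header, in order."""
    return [value for key, value in headers if key == "note"]


raw = (
    "tree 4b825dc642cb6eb9a060e54bf8d69288fbee4904\n"
    "author A U Thor <author@example.com> 1216000000 +0200\n"
    "committer A U Thor <author@example.com> 1216000000 +0200\n"
    "note origin svn r1234\n"   # hypothetical header and value
    "\n"
    "Import revision 1234\n"
)
headers, message = parse_commit(raw)
print(notes(headers))
print(message)
```

The commit message stays exactly as it was in the foreign SCM, and the
ID travels with the commit object.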

Another would be to save rename and copy info when importing from
another SCM which tracks renames rather than detecting code movement.
This would allow (at least theoretically) for lossless import. When
detecting renames, in the process of finding common merge base(s), we
could check and take into account such information. It would be purely
advisory...

-- 
Jakub Narebski
Poland