Re: Fix branch ancestry calculation

On Fri, Mar 24, 2006 at 08:38:58AM -0800, Keith Packard wrote:
> On Fri, 2006-03-24 at 07:46 -0800, Linus Torvalds wrote:
> > 
> > On Fri, 24 Mar 2006, David Mansfield wrote:
> > > 
> > > Anyway, I'd like to nail down some of the other nagging ancestry/branch point
> > > problems if possible.
> > 
> > What I considered doing was to just ignore the branch ancestry that cvsps 
> > gives us, and instead use whatever branch that is closest (ie generates 
> > the minimal diff). That's really wrong too (the data just _has_ to be in 
> > CVS somehow), but I just don't know how CVS handles branches, and it's how 
> > we'd have to do merges if we were to ever support them (since afaik, the 
> > merge-back information simply doesn't exist in CVS).
> 
> cvsps is more of a problem than cvs itself. Per-file branch information
> is readily available in the ,v files; each version has a list of
> branches from that version, and there are even tags marking the names of
> them. One issue that I've discovered is when files have differing branch
> structure in the same repository. That happens when a branch is created
> while files are checked out on different branches.  I'm not quite sure
> what to do in this case; I've been trying several approaches and none
> seem optimal. One remaining plan is to just attach such branches by
> date, but that assumes that the first commit along a branch occurs
> shortly after the branch is created (which isn't required).
> 
> Of course, this branch information is only created when a change is made
> to the file along said branch, so most of the repository will lack
> precise branch information for each branch. When you create a child
> branch, the files with no commits in the parent branch will never get
> branch information, so the child branch will be numbered as if it were a
> branch off of the grandparent. Globally, it is possible to reconstruct
> the entire branch structure.

If that last sentence was a typo then you already know this, but
otherwise you may be disappointed to learn that it's not _always_
possible to discern the correct ancestry tree.

The simplest counter-example is two branches where each adds one file
and no files in common are modified.  If A and B both branched off of
HEAD and each adds one file, then they should each only have one file.
But if B branched from A which branched from HEAD, then B should also
have the file that was added to A. (*)  However, the information to
distinguish these two cases isn't recorded in CVS.  
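
To make the ambiguity concrete, here is a toy sketch in Python. It is
not CVS's real data model: it assumes the only thing we can observe is
which file was committed on which branch, and that a child branch
inherits every file of its parent. The file names and the helper
files_on_branch are made up for illustration.

```python
# Toy model (an assumption, not the real ,v format): the only observation
# is a list of (branch, file) pairs for files added on each branch.
observed_commits = [("A", "a.c"), ("B", "b.c")]


def files_on_branch(branch, parent_of, commits):
    """Files visible on `branch`: its own additions plus inherited ones."""
    files = set()
    while branch is not None:
        files |= {f for b, f in commits if b == branch}
        branch = parent_of.get(branch)
    return files


# Two candidate ancestry trees that fit the same observations.
siblings = {"HEAD": None, "A": "HEAD", "B": "HEAD"}  # A and B both off HEAD
chained = {"HEAD": None, "A": "HEAD", "B": "A"}      # B off A, A off HEAD

print(sorted(files_on_branch("B", siblings, observed_commits)))
print(sorted(files_on_branch("B", chained, observed_commits)))
```

Both trees are consistent with the same observed commits, yet they
disagree about what a checkout of B should contain.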

I seem to have described this example more fully in the notes I took
while writing the patch to cvsps that does the global inference
you're describing.  You _usually_ can make a very good guess, and the
more files that are modified, the better you can do.
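
One way such a guess could work, sketched in Python. This is my own
illustration of the idea and not necessarily what the cvsps patch does.
It assumes each file that was modified on a branch reports an apparent
parent for that branch, and that files untouched on the real parent
report an older ancestor instead, so the deepest reported candidate is
the best guess. The name infer_parent and the sample data are
hypothetical.

```python
def infer_parent(candidates, parent_of):
    """Pick the deepest candidate parent reported by any file.

    `parent_of` maps already-resolved branches to their parents
    (None marks the root). Illustrative heuristic only.
    """
    def depth(branch):
        d = 0
        while parent_of.get(branch) is not None:
            branch = parent_of[branch]
            d += 1
        return d

    # sorted() only makes the result deterministic when depths tie.
    return max(sorted(set(candidates)), key=depth)


resolved = {"HEAD": None, "A": "HEAD"}

# Three files were modified on B; only one was also touched on A, so
# only that file reveals A as the parent of B.
print(infer_parent(["HEAD", "A", "HEAD"], resolved))

# If no file modified on B was ever touched on A, every file points at
# HEAD and the guess comes out as HEAD whether or not that is true.
print(infer_parent(["HEAD", "HEAD"], resolved))
```

Each extra modified file is another chance to see the true parent,
which is why the guess improves as more files are touched.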

BTW, those notes are still available here:
http://www.codesifter.com/cvsps-notes.txt 

If you end up comparing the ancestry tree discovered by your tool and
the tree output by a patched cvsps, I would be very interested in the
results.

-chris

(*) You can distinguish between A->B->head and B->A->head simply by
date.
