Re: [RFC] git-split: Split the history of a git repository by subdirectories and ranges

Linus Torvalds wrote:
> On Mon, 23 Oct 2006, Josh Triplett wrote:
>  - The nice one that doesn't throw away potentially interesting 
>    duplicate paths to reach the same end result. We don't have this one, 
>    so no git commands do this yet.
> 
>    The way to do this one would be "--full-history", but then removing all 
>    parents that are "redundant". In other words, for any merge that 
>    remains (because of the --full-history), check if one parent is a full 
>    superset of another one, and if so, remove the "dominated" parent, 
>    which simplifies the merge. Continue until nothing can be simplified 
>    any more.
> 
>    This would _usually_ end up giving the same graph as the "extreme" 
>    simplification, but if there were two branches that really _did_ 
>    generate the same end result using different commits, they'd remain in 
>    the end result.
> 
> The problem with the "nice one" is that it's expensive as hell. There may 
> be clever tricks to make it less so, though. But I think it's the 
> RightThing(tm) to do, at least as an option for when you really want to 
> see a reasonable history that still contains everything that is relevant.

So, if a commit has more than one parent (a merge), you want to
eliminate any parent that is an ancestor of another parent of the same
merge (including a parent that is the same commit as another), but not
eliminate parents that have different head commits but the same tree
object?  That seems simple enough; I *think* git-split already does
that, though I haven't tested that particular case.  If git log keeps
only one of several parents that have different commits but the same
tree, then I believe the commit sequence generated by git-split will
differ from that of git log in that case, because git-split includes all
such parents.
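
For illustration only, here is a minimal sketch of the parent
elimination described above, following the ancestry-based reading of
"dominated" in this message. It is not git-split's actual code. The
commit graph is assumed to be a plain dict mapping a commit ID to its
list of parent IDs, and the names `ancestors` and `simplify_parents` are
hypothetical.

```python
def ancestors(commit, parents):
    """Return every commit reachable from `commit` via parent links,
    not counting `commit` itself."""
    seen = set()
    stack = list(parents.get(commit, ()))
    while stack:
        c = stack.pop()
        if c not in seen:
            seen.add(c)
            stack.extend(parents.get(c, ()))
    return seen


def simplify_parents(commit, parents):
    """Drop duplicate parents and any parent that is an ancestor of
    another parent. Tree contents are never consulted, so parents with
    different commits but identical trees are all kept."""
    # Remove repeated commit IDs while keeping the original order.
    unique = list(dict.fromkeys(parents.get(commit, ())))
    reach = {p: ancestors(p, parents) for p in unique}
    return [p for p in unique
            if not any(p in reach[q] for q in unique if q != p)]


# Hypothetical graph: M merges B, A and C; B and C both descend from A.
graph = {"M": ["B", "A", "C"], "B": ["A"], "C": ["A"], "A": []}
remaining = simplify_parents("M", graph)  # A is dropped, B and C stay
```

The ancestor walk is done per parent here for clarity; a real
implementation would presumably share work between parents rather than
recompute each reachable set from scratch.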

I do agree that the behavior you describe seems like the best
simplification.  The alternative you describe as "extreme
simplification" (picking a parent arbitrarily) doesn't make any sense to
me, nor does it seem any simpler to generate: either way, you still have
to figure out whether one parent has another as an ancestor, and the
"extreme simplification" just *adds* a comparison of tree hashes on top
of that.
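
As a rough sketch of that extra step, under this message's reading of
"extreme simplification" (which may not match what git log really
does): after ancestor elimination, parents whose trees are identical
would be collapsed to a single one. The function name
`collapse_same_tree` and the `tree_of` mapping from commit ID to tree
hash are hypothetical.

```python
def collapse_same_tree(candidates, tree_of):
    """Keep only the first parent seen for each distinct tree hash.
    `candidates` is an ordered list of parent commit IDs."""
    kept = {}
    for p in candidates:
        # setdefault leaves the first parent recorded for a tree in place.
        kept.setdefault(tree_of[p], p)
    return list(kept.values())


# Hypothetical data: B and C are different commits with the same tree.
tree_of = {"B": "t1", "C": "t1", "D": "t2"}
collapsed = collapse_same_tree(["B", "C", "D"], tree_of)  # C is dropped
```

Skipping this step is what leaves both B and C in the result, which is
the behavior the "nice" simplification asks for.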

Or have I misunderstood the case you have concerns about?  Why would the
"nice" format incur additional cost?

- Josh Triplett

