Re: [RFC] git-split: Split the history of a git repository by subdirectories and ranges

Junio C Hamano wrote:
> Josh Triplett <josh@xxxxxxxxxxxxxxx> writes:
>> Linus Torvalds wrote:
>>> And yes, that's done by the core revision parsing code, so when you do
>>>
>>> 	git log --full-history --parents -- $project
>>>
>>> you do get the rewritten parent output (of course, it's not actually 
>>> _simplified_, so you get a fair amount of duplicate parents etc which 
>>> you'd still have to simplify and which don't do anything at all).
>>>
>>> Without the "--full-history", you get a simplified history, but it's 
>>> likely to be _too_ simplified for your use, since it will not only 
>>> collapse multiple identical parents, it will also totally _remove_ parents 
>>> that don't introduce any new content.
>> Considering that git-split does exactly that (remove parents that don't
>> introduce new content, assuming they changed things outside the
>> subtree), that might actually work for us.  I just checked, and the
>> output of "git log --parents -- $project" on one of my repositories
>> seems to show the same sequence of commits as git log --parents on the
>> head commit printed by git-split $project (apart from the rewritten
>> sha1s), including elimination of irrelevant merges.
> 
> So one potential action item that came out from this discussion
> for me is to either modify --pretty=raw (or add --pretty=rawish)
> that gives the rewritten parents instead of real parents?  With
> that, you can drop the code to simplify ancestry by hand in your
> loop, and also you do not have to do the grafts information
> yourself either?
> 
> If that is the case I'd be very happy.
> 
> The only thing left for us to decide is if reporting the true
> parenthood like the current --pretty=raw makes sense (if so we
> need to keep it and introduce --pretty=rawish).
> 
> The only in-tree user of --pretty=raw seems to be git-svn but it
> only looks at path-unlimited log/rev-list from one given commit,
> so the only difference between dumping what is recorded in the
> commit object and listing what parents we _think_ the commit has
> is what we read from grafts.  I think we are safe to just "fix"
> the behaviour of --pretty=raw

I actually think I want to look further into the idea of just using git log
--pretty=raw --parents -- $project, and see if I can find any corner cases
where it generates a different history than what we want.  This combination of
options seems like it provides everything we need: redundant history
simplification, parent rewriting based on simplification and grafts, and easy
parsing.  If the only case in which it differs occurs when you have two
distinct commits with identical trees, I don't know that I care too much; that
particular scenario seems unlikely to occur in any of the trees I care about,
and any sane simplification behavior for it seems OK. :) As long as it runs
correctly with various ancestor/descendant/cousin/unrelated relationships
between merged branches (which I want to test further), I think it will do the
job nicely.
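
For anyone who wants to experiment with this, here is a rough sketch of how the
rewritten parents could be read back out of that output. It assumes that with
--parents the "commit" header line lists the commit followed by its rewritten
parents, and that message lines are indented in the raw format, so only real
header lines start with "commit " in column 0. The function names
parse_rewritten_parents and rewritten_parents_for are made up for illustration
and are not part of git or git-split.

```python
import subprocess


def parse_rewritten_parents(log_text):
    """Map each commit id to the parents listed on its 'commit' header line."""
    parents = {}
    for line in log_text.splitlines():
        # Header lines start in column 0; commit message lines are indented,
        # so a message that happens to contain "commit ..." is not matched.
        if line.startswith("commit "):
            sha, *rewritten = line.split()[1:]
            parents[sha] = rewritten
    return parents


def rewritten_parents_for(path, rev="HEAD"):
    """Run git log limited to `path` and parse the result (hypothetical helper)."""
    out = subprocess.run(
        ["git", "log", "--pretty=raw", "--parents", rev, "--", path],
        check=True,
        capture_output=True,
        text=True,
    ).stdout
    return parse_rewritten_parents(out)
```

Comparing the mapping returned for a subdirectory against the history that
git-split produces for the same subdirectory would be one way to hunt for the
corner cases mentioned above.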

- Josh Triplett
