On Fri, Feb 19, 2010 at 2:13 PM, Avery Pennarun <apenwarr@xxxxxxxxx> wrote:
> On Thu, Feb 18, 2010 at 8:04 PM, Jon Seymour <jon.seymour@xxxxxxxxx> wrote:
>
> Using the linearization mechanism you propose, you end up producing a
> false history: one in which, other than at certain checkpoints, the
> code doesn't even work. What's the point of such a history? It
> neither reflects the true development history (ie. pre-linearization)
> nor a more useful, idealized version of history (ie. one that compiles
> at every point and adds features in a rational order and is useful for
> git bisect).

If there are no merge conflicts in the original history, then there
will be no merge conflicts in the rewritten history, and therefore no
error deltas.

The point of creating a linearization of this kind is that, if there
are no merge conflicts, it flattens the hierarchy into a form that is
immediately rebaseable and that faithfully represents the work the
developer would have done had they rebased at each merge instead of
merging.

If there are merge conflicts, then it produces a history that indicates
the extent of the merge conflict rectification that will be needed,
which then allows you to decide whether you want to attempt the rebase.
If you decide to rebase, then it should just be a question of deleting
the delta commits and fixing the merge conflicts as they crop up.

My contention is that most of the text diffs in the rewritten history
(with the exception of the error deltas) will actually represent the
intent of the developer's original changes. Until the rectification
work is done, though, the commit sequences bounded by error deltas
would not be usable for git bisect, compiles or any other purpose that
requires an intact tree.

In the no-conflict case, it is not clear to me that the history
resulting from your script is immediately rebaseable, precisely
because of the presence of the merge commits [feel free to correct me
if I am wrong about that].
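
To make the no-conflict claim above concrete, here is a toy Python
model. It is only a sketch under assumptions of mine, not the
linearization code itself: a tree is a dict of path to content, a
commit is a dict of the paths it rewrites, and the names apply_change
and linearize are made up for illustration. The error delta is whatever
still differs from the tree recorded by the original merge once both
branches have been replayed in sequence.

```python
def apply_change(tree, change):
    """Return a copy of tree (path -> content) with change applied."""
    new_tree = dict(tree)
    new_tree.update(change)
    return new_tree


def linearize(base, branch_a, branch_b, merged_tree):
    """Replay branch_a, then branch_b, on top of base.

    Returns (history, error_delta). error_delta holds the paths whose
    replayed content still differs from the tree the original merge
    commit recorded. Deleted paths are ignored in this toy model.
    """
    tree = base
    history = []
    for change in branch_a + branch_b:
        tree = apply_change(tree, change)
        history.append(tree)
    error_delta = {
        path: content
        for path, content in merged_tree.items()
        if tree.get(path) != content
    }
    return history, error_delta


base = {"a.c": "v0", "b.c": "v0"}

# Branches touch different paths: the replay reproduces the merge tree.
_, clean = linearize(
    base, [{"a.c": "a1"}], [{"b.c": "b1"}], {"a.c": "a1", "b.c": "b1"}
)

# Both branches touch a.c, and the merge commit recorded a hand resolution.
_, conflicted = linearize(
    base, [{"a.c": "a1"}], [{"a.c": "b1"}], {"a.c": "a1+b1", "b.c": "v0"}
)

print(clean)       # {}
print(conflicted)  # {'a.c': 'a1+b1'}
```

In the first case there is no delta commit to delete before rebasing.
In the second, the delta marks the one path where the merge resolution
would have to be redone.
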
With my approach, the merge commits dissolve away - there is nothing
to edit.

>
> It doesn't even provide something useful for patch review, since half
> your patches will have randomly-selected conflict resolutions (ie.
> changes to unrelated code that never should have changed) thrown in.
> You'd be better off reviewing patches from the original history, and
> just ignoring merge commits, which is what 'git format-patch' or just
> 'git log -p' would do automatically.

The conflict resolutions are far from random. They are precisely chosen
to reconstruct the blob in such a way that all subsequent picks in the
same path segment apply cleanly. This is a deliberate choice because we
know that the conflict will be resolved eventually. We are temporarily
deferring correctness to allow us to proceed automatically with a
speculative rewrite of the merge history as a rebase history. The
extent of incorrectness in the history is well delimited and well
understood.

>
> The result is also still not suitable for submission upstream: the
> sync points (where the files actually match what the developer had in
> his tree) are the only places where the code is even likely to
> compile, and yet they *also* include all the code brought in by prior
> merges, which you already said include code that shouldn't go
> upstream.

I agree it is not suitable for many purposes. I contend that what it
allows one to do is rewrite the merge history as a rebase history in a
form that allows the merge conflict resolutions to be deferred. In the
no-conflict case, the linearization is immediately usable (with no
further edits) as a rebase source.

>
> The linearization script I gave you at least has these interesting
> characteristics:
>
> - If the original history compiled at every point, then the linearized
> history does too.
>
> - It is an accurate representation of the successive states of the
> tree experienced by the original developer.
>
> - You can use 'git rebase' to incrementally rearrange and combine
> patches until they make enough sense to submit upstream.
>
> - It is easy to separate out merges (which usually don't need patch
> review) from individual patches (which do).
>
> - If some merges added useless code, you can remove them completely
> with rebase by just removing a single patch from the list.
>
> Of course, even with this script, it will still take work (rebasing)
> to produce code that's polished and ready to go upstream. But I'm not
> sure there's a way to automate that without producing interim versions
> that are much, much worse.
>
> Have fun,
>
> Avery
>
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html