Thanks for your comments.

On 06/15/2017 07:57 PM, Michael O'Cleirigh wrote:
Hi Michael,

In git, if you don't merge often you end up in these merge-conflict hell situations. In my experience the main conflicts come not from the unified diff of those 130 commits but from differences in the surrounding code.

Merging, rebasing, or cherry-picking directly onto the latest upstream sounds impossible to me. These conflicts come from the distance between the local fork branch and the upstream branch. You need to merge through closer commits first to have a hope of getting something automatic to work: something like getting the list of releases made upstream in the last 5 years and merging them in order into the fork branch, i.e. merge v1, merge v2, ... merge v300.

I went through something similar with a Subversion repo we converted to git. In Subversion they were cherry-picking finished work into a release branch. In git a feature-branch model was being used. It turned out some commits were never cherry-picked, and bringing them to the latest release was hard.

We tried many of the approaches you outlined, took what git would give us automatically, and in the hairiest cases recreated the changes on the latest upstream by reading the diff of the original commit and rewriting it against the latest code.

In terms of how the history looks after the merge conflicts are resolved, you could fold the fixups into a single commit applied onto the original fork branch, so that history would show the 130-commit branch merged directly into the upstream. You would use the git-commit-tree command to reuse the merged tree id and make it a merge commit whose parents are the 130th commit id and the upstream commit id.

Regards,
Michael

On Thu, Jun 15, 2017 at 8:52 PM, Michael Eager <eager@xxxxxxxxxx> wrote:

Hi All --

I'm working with code that is based on a five-year-old repository. There are 130 local commits since the repo was forked. Naturally, the upstream project has moved on significantly.

I'm wondering about best approaches to updating the repo to the current upstream version. Here are the approaches I've considered:

- Rebase from upstream. Likely almost every patch will fail with multiple merge conflicts.

- Merge local branch into upstream. Likely many merge failures, but fewer than with rebase.

- Apply individual patches from the old repo to the upstream repo. Fix merge conflicts, rebuild, fix build failures. There may be some duplication and additional merge problems created, where a later patch from the old repo fixes the same conflict or build failure.

I've tried each of these approaches on various projects. Each has problems. After resolving merge issues there are build failures which need to be resolved and additional patches created. The result is that the patch history is a bit chaotic, where there are later patches which fix problems with early patches. I've tried to sort the fix patches to follow the patch they correct, so that the fixes were together and I could merge them, but that can be difficult.

I've used Stacked Git a little, but don't know if it will make any of this easier.

On some projects, I've reimplemented changes in the upstream repo, abandoning the patch history from the old repo:

- Create diff of old repo and upstream. Apply only the changes to add new functionality, which are in the patches to the old repo. Fix problems caused by API changes, renamed files, etc.

- Re-implement the changes on the upstream repo. Some of the old code would be re-used, but modified to fit in the current upstream. Some new code would be written.

One other variant of the rebase approach I've thought of is to do this incrementally, rebasing the old repo against an upstream commit a short time after the old repo was forked, fixing any conflicts, rebuilding and fixing build failures. Then repeat, with a bit newer commit. Then repeat, until I get to the top. This sounds tedious, but some of it can be automated.
It also might result in my making the changes compatible with upstream code which was later abandoned or significantly changed.

Anyone have a different approach that I should consider? Or maybe offer advice on how to make one of these approaches work better? What is best practice to update an old repo?

--
Michael Eager eager@xxxxxxxxxxxx
1960 Park Blvd., Palo Alto, CA 94306 650-325-8077
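
For illustration, the release-by-release merge suggested in the quoted reply (merge v1, merge v2, ...) might be sketched roughly as below. This is only a sketch under assumptions: it builds a throwaway repository in a temporary directory so it can be run safely, and the branch names (upstream, fork), the tag names (v1 to v3) and the file names are hypothetical stand-ins, not anything from the real project. In a real fork the loop would stop at the first tag that conflicts so the conflict can be resolved by hand.

```shell
#!/bin/sh
# Sketch only: every name below (upstream, fork, v1..v3, core.txt,
# local.txt) is a hypothetical stand-in created inside a scratch repo.
set -eu

demo=$(mktemp -d)
cd "$demo"
git init -q
git config user.name "Demo"
git config user.email "demo@example.invalid"
git config commit.gpgsign false

# Common ancestor of upstream and the fork.
echo base > core.txt
git add core.txt
git commit -q -m "base"
git branch -M upstream
git branch fork

# Upstream moves on and tags three releases.
for n in 1 2 3; do
    echo "upstream change $n" >> core.txt
    git commit -q -am "upstream release $n"
    git tag "v$n"
done

# The fork carries a local change (in a different file, so this demo
# merges cleanly; a real fork will not be so lucky).
git checkout -q fork
echo "local feature" > local.txt
git add local.txt
git commit -q -m "local feature"

# Merge each release into the fork in version order, stopping at the
# first one that needs manual conflict resolution.
for tag in $(git tag --list 'v*' --sort=version:refname); do
    if ! git merge -q --no-edit "$tag"; then
        echo "conflict while merging $tag; resolve and commit, then continue" >&2
        exit 1
    fi
done

merged_releases=$(git log --merges --oneline | wc -l | tr -d ' ')
echo "merged $merged_releases releases"
```

Each merge only has to bridge the distance between two adjacent releases, which is the point made in the reply about merging through closer commits first.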
--
Michael Eager eager@xxxxxxxxxxxx
1960 Park Blvd., Palo Alto, CA 94306 650-325-8077
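
The git-commit-tree suggestion in the quoted reply can be illustrated with a similar throwaway repository. The sketch below assumes the conflict resolution and build fixes were done on a scratch branch; it then reuses that branch's final tree for one merge commit whose parents are the fork tip (standing in for the 130th commit) and the upstream tip. The branch and file names are hypothetical, and the single "fixup" commit stands in for whatever cleanup the real merge needs.

```shell
#!/bin/sh
# Sketch only: branch names (upstream, fork, scratch) and file names
# are hypothetical stand-ins created inside a scratch repo.
set -eu

demo=$(mktemp -d)
cd "$demo"
git init -q
git config user.name "Demo"
git config user.email "demo@example.invalid"
git config commit.gpgsign false

# Common ancestor, then diverging upstream and fork history.
echo base > core.txt
git add core.txt
git commit -q -m "base"
git branch -M upstream
git branch fork

echo "upstream change" >> core.txt
git commit -q -am "upstream work"

git checkout -q fork
echo "local feature" > local.txt
git add local.txt
git commit -q -m "local commit (stands in for the 130th)"
fork_tip=$(git rev-parse HEAD)

# Scratch branch standing in for the messy merges and fixup commits.
git checkout -q -b scratch
git merge -q --no-edit upstream
echo "build fix" >> local.txt
git commit -q -am "fixup after merge"

# Reuse the final, fixed-up tree for a single merge commit with the
# fork tip as first parent and the upstream tip as second parent.
tree=$(git rev-parse 'scratch^{tree}')
upstream_tip=$(git rev-parse upstream)
merge_commit=$(git commit-tree "$tree" -p "$fork_tip" -p "$upstream_tip" \
    -m "Merge upstream into fork")

# Point the fork branch at the new merge commit.
git checkout -q fork
git reset -q --hard "$merge_commit"
git --no-pager log --oneline --graph
```

After this, the history of fork shows the local commits merged directly with upstream, and the scratch branch holding the intermediate fixups can be deleted.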