On 2020-12-25 23:24, M. Buecher wrote:
Dear all,

I finally had the time to start converting some older Subversion repositories to git repositories and ran into issues with repos that use "parallel" branches, so-called vendor branches [1]. In Subversion these track upstream changes via snapshots in a separate branch, which are then merged into the custom build on trunk. I just want to convert the Subversion history as-is to git.

I studied the git-svn reference docs [2] plus the related chapter of the ProGit book [3] and assume that I understood how git-svn works. Still, I'm not a git expert, just a sporadic user.

Somehow `git svn fetch` (2.29.2.windows.3) loses the merge information between the vendor branch and trunk, and tags reference the predecessors instead of the original ancestor. Maybe there is a manual way to fix this for that small repository, but it wouldn't be feasible for larger repositories, which is why I decided to write this bug report. Is there anything I missed? (See my procedure below.)

Fortunately I ran into these issues quite early, and with a very small repository of just 11 commits, so I can provide a small reproduction case (see below, after the links). I hope you can enhance `git svn fetch`.
I tested further with re-created Subversion repositories (both attached): one that looks the same but makes sure that svn:mergeinfo is present (Subversion >=1.5 via "cherry pick merge", no "2-URL merge"), and one that has the vendor branch copied from trunk. Only when the vendor branch was a copy of trunk did git svn get the merges right. Otherwise, even with svn:mergeinfo, it does not pick up the merges.
Assumption: It seems that git svn handles trunk and branches in a special way, but Subversion does not actually have branches. In Subversion there are just directories and files; a branch or tag is just a copy of another directory at some revision, and merges can happen between any directories, regardless of whether they share ancestry.
Workaround: As git svn does not recognize all merges correctly (especially cross-branch copies), the lost links must be added manually. I wrote a small GNU awk script (attached) to determine all cross-branch copies from an `svnadmin dump`. This gives the first revision in which something was copied between two branches. I run git svn just up to that revision, fix the parents, and then continue with git svn up to the next such revision. This helps git svn find the correct ancestors and build the following merges correctly. The parents can be changed with `git replace -f --graft <commit> <existing correct parents> <additional correct parents>`. Additionally, before continuing with git svn, this `git replace` change can be made permanent with `git-filter-repo --force --replace-refs delete-no-add`.
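The stepwise workaround could be wrapped roughly as below. This is only a sketch: the helper names `run`, `fetch_to`, `graft` and `make_permanent` and the `DRY_RUN` switch are made up for illustration, and by default the helpers just print the commands instead of executing them. The only git invocations used are `git svn fetch -r <from>:<to>` and the two commands quoted above.

```shell
#!/bin/sh
# Sketch of the stepwise workaround. With DRY_RUN=1 (the default here)
# every command is only printed; set DRY_RUN=0 to really execute it.

run() {
    if [ "${DRY_RUN:-1}" = 1 ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# fetch_to FROM TO [extra git-svn options...]
# Fetch only the Subversion revisions FROM..TO.
fetch_to() {
    from=$1; to=$2; shift 2
    run git svn fetch -r "${from}:${to}" "$@"
}

# graft COMMIT PARENT...
# Replace the parents of COMMIT (existing correct parents first,
# then the additional ones that git svn missed).
graft() {
    run git replace -f --graft "$@"
}

# Rewrite history so the replacement becomes permanent.
make_permanent() {
    run git-filter-repo --force --replace-refs delete-no-add
}
```

A session would then look like `fetch_to 0 <rev>`, look up the commits in gitk, `graft <commit> <parents...>`, `make_permanent`, and so on for the next revision reported by the awk script. The commit hashes still have to be looked up by hand after each fetch.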
Any help is appreciated, thanks in advance
Matthias Bücher

[Links]
[1] http://svnbook.red-bean.com/en/1.8/svn.advanced.vendorbr.html#svn.advanced.vendorbr.mirrored-sources
[2] https://git-scm.com/docs/git-svn
[3] https://git-scm.com/book/en/v2/Git-and-Other-Systems-Migrating-to-Git

[System Info]
git version:
git version 2.29.2.windows.3
cpu: x86_64
built from commit: d054eb1fc46ff23e7c95756a7c747e2f2864b478
sizeof-long: 4
sizeof-size_t: 8
shell-path: /bin/sh
uname: Windows 10.0 19042
compiler info: gnuc: 10.2
libc info: no libc information available
$SHELL (typically, interactive shell): C:\Program Files\Git\usr\bin\bash.exe

[Enabled Hooks]
none

[Commits in Subversion]
A Subversion repository dump is attached, plus a test where I recreated the same history directly in a git repository without an issue.

* trunk:  a------------d--e--h--i--j--k
*                     /     /
* vendor: a (empty)--b-----f
*                    ^     ^
* tags:              c     g

* Vendor releases: b, f
* Custom modifications: e, i, j, k
* Tags: really just used as tags, although Subversion internally they are branches.
  Therefore wanting to create lightweight git tags, although annotated git tags would be fine too.

[Expected Subversion to git repo conversion]
* trunk => branch "main"
* branches/* => branch "*"
* tags/* => tag "*"
* vendor/current => branch "vendor/current" (can be renamed later to just "vendor")
* vendor/* => tag "vendor/*" (except for vendor/current)

Expected "tags":
vendor/5.4 = rev b (maybe c when annotated)
vendor/5.8 = rev f (maybe g when annotated)

[Wrong Results]
Merges from "vendor/current" branch to trunk get lost.
Tags are referencing the predecessors of the expected commit.
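To make the expected conversion listed above concrete, here is a minimal sketch of how the remote refs created by git svn (with the "svn/" prefix from my config) could be mapped to the target branch and tag names. The function name `map_ref` is made up for illustration, and this only renames refs; it does not fix which commit a tag points to.

```shell
#!/bin/sh
# map_ref REF: print the local ref that a git-svn remote ref should become,
# following the expected mapping (trunk => main, tags => tags, rest => branches).
map_ref() {
    case "$1" in
        refs/remotes/svn/trunk)
            echo "refs/heads/main" ;;
        refs/remotes/svn/tags/*)
            # covers tags/* and vendor/* (stored under svn/tags/vendor/*)
            echo "refs/tags/${1#refs/remotes/svn/tags/}" ;;
        refs/remotes/svn/*)
            # covers branches/* and vendor/current
            echo "refs/heads/${1#refs/remotes/svn/}" ;;
        *)
            return 1 ;;
    esac
}

# Possible use after the fetch (not executed here):
#   git for-each-ref --format='%(refname)' refs/remotes/svn |
#   while read -r ref; do
#       git update-ref "$(map_ref "$ref")" "$ref"
#   done
```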
[Procedure]
```
### a) preparation
cd /d/Coding
#
cd dd-formmailer
svn log --xml --quiet | grep author | sort -u | perl -pe 's/.*>(.*?)<.*/$1 = /' > authors-transform.txt
## edit authors-transform.txt accordingly

### b) git-svn adapted from ProGit book, but with "svn/" prefix
cd /d/Coding
git svn init --stdlayout --no-metadata --prefix="svn/" -- 'svn+ssh://svn@vcs/dd-formmailer' dd-formmailer-git
cd dd-formmailer-git
## edit .git/config for additional vendor branches and tags
<< __EOF
...
[svn-remote "svn"]
...
	fetch = trunk:refs/remotes/svn/trunk
#	branches = vendor/current:refs/remotes/svn/vendor/current ## non-glob definition not working for branches
	branches = vendor/{current}:refs/remotes/svn/vendor/*
	branches = branches/*:refs/remotes/svn/*
	tags = vendor/*:refs/remotes/svn/tags/vendor/*
	tags = tags/*:refs/remotes/svn/tags/*
__EOF
## cat .git/config
git svn fetch --authors-file ../dd-formmailer/authors-transform.txt
#
git branch -vv --list ; git for-each-ref
gitk --all &
```

[Log]
$ cat .git/config
[core]
	repositoryformatversion = 0
	filemode = false
	bare = false
	logallrefupdates = true
	symlinks = false
	ignorecase = true
[svn-remote "svn"]
	noMetadata = 1
	url = svn+ssh://svn@vcs/dd-formmailer
	fetch = trunk:refs/remotes/svn/trunk
	branches = vendor/{current}:refs/remotes/svn/vendor/*
	branches = branches/*:refs/remotes/svn/*
	tags = vendor/*:refs/remotes/svn/tags/vendor/*
	tags = tags/*:refs/remotes/svn/tags/*

$ git svn fetch --authors-file ../dd-formmailer/authors-transform.txt
r1 = 158d53044a5628897379403647a19ea13594b532 (refs/remotes/svn/vendor/current)
	A	_svn__guideline.txt
	A	_svn_client_config.txt
	A	_svn_dir_ignore_list.txt
r1 = a795b654edbc296e6f38da398f032a4851fd0a9e (refs/remotes/svn/trunk)
	A	dd-formmailer.css
	A	dd-formmailer.php
	A	lang/BrazilianPortuguese.php
	A	lang/Catalan.php
	A	lang/Danish.php
	A	lang/Deutsch.php
	A	lang/Dutch.php
	A	lang/English.php
	A	lang/Finnish.php
	A	lang/French.php
	A	lang/Greek.php
	A	lang/Italian.php
	A	lang/NorwegianBokmaal.php
	A	lang/Polish.php
	A	lang/Portuguese.php
	A	lang/Romanian.php
	A	lang/Russian.php
	A	lang/Slovak.php
	A	lang/Slovene.php
	A	lang/Spanish.php
	A	lang/Swedish.php
	A	lang/Turkish.php
	A	recaptchalib.php
r2 = 8fd8668dbed2dde6b55306c71b3b629f5ed794ec (refs/remotes/svn/vendor/current)
Found possible branch point: svn+ssh://svn@vcs/dd-formmailer/vendor/current => svn+ssh://svn@vcs/dd-formmailer/vendor/5.4, 1
Found branch parent: (refs/remotes/svn/tags/vendor/5.4) 158d53044a5628897379403647a19ea13594b532
Following parent with do_switch
	A	dd-formmailer.css
	A	dd-formmailer.php
	A	lang/BrazilianPortuguese.php
	A	lang/Catalan.php
	A	lang/Danish.php
	A	lang/Deutsch.php
	A	lang/Dutch.php
	A	lang/English.php
	A	lang/Finnish.php
	A	lang/French.php
	A	lang/Greek.php
	A	lang/Italian.php
	A	lang/NorwegianBokmaal.php
	A	lang/Polish.php
	A	lang/Portuguese.php
	A	lang/Romanian.php
	A	lang/Russian.php
	A	lang/Slovak.php
	A	lang/Slovene.php
	A	lang/Spanish.php
	A	lang/Swedish.php
	A	lang/Turkish.php
	A	recaptchalib.php
Successfully followed parent
r3 = 7ab4f436cafc8af3ed2e727a6d8cbef1a8f8b39f (refs/remotes/svn/tags/vendor/5.4)
	A	dd-formmailer.css
	A	dd-formmailer.php
	A	lang/BrazilianPortuguese.php
	A	lang/Catalan.php
	A	lang/Danish.php
	A	lang/Deutsch.php
	A	lang/Dutch.php
	A	lang/English.php
	A	lang/Finnish.php
	A	lang/French.php
	A	lang/Greek.php
	A	lang/Italian.php
	A	lang/NorwegianBokmaal.php
	A	lang/Polish.php
	A	lang/Portuguese.php
	A	lang/Romanian.php
	A	lang/Russian.php
	A	lang/Slovak.php
	A	lang/Slovene.php
	A	lang/Spanish.php
	A	lang/Swedish.php
	A	lang/Turkish.php
	A	recaptchalib.php
r4 = 7ec3663fd8e7c9fcfeb2742968a948d6978776d4 (refs/remotes/svn/trunk)
	M	dd-formmailer.css
	M	dd-formmailer.php
	A	dd-verify.php
r5 = e65ba2bdcd12a1935c1a327507dad6b7117f452b (refs/remotes/svn/trunk)
	A	calendar.gif
	A	date_chooser.js
	M	dd-formmailer.css
	M	dd-formmailer.php
	A	lang/Belarussian.php
	A	lang/Czech.php
	A	lang/Estonian.php
	A	lang/Japanese.php
	A	lang/Vietnamese.php
	M	recaptchalib.php
r6 = 9f95df7ce49c5d1cb4715017d223a8bd1c8dcffc (refs/remotes/svn/vendor/current)
Found possible branch point: svn+ssh://svn@vcs/dd-formmailer/vendor/current => svn+ssh://svn@vcs/dd-formmailer/vendor/5.8, 3
Found branch parent: (refs/remotes/svn/tags/vendor/5.8) 8fd8668dbed2dde6b55306c71b3b629f5ed794ec
Following parent with do_switch
	A	calendar.gif
	A	date_chooser.js
	M	dd-formmailer.css
	M	dd-formmailer.php
	A	lang/Belarussian.php
	A	lang/Czech.php
	A	lang/Estonian.php
	A	lang/Japanese.php
	A	lang/Vietnamese.php
	M	recaptchalib.php
Successfully followed parent
r7 = cc00cb187386298cf974dda69f151d2ad4795917 (refs/remotes/svn/tags/vendor/5.8)
	A	calendar.gif
	A	date_chooser.js
	M	dd-formmailer.css
	M	dd-formmailer.php
	M	dd-verify.php
	A	lang/Belarussian.php
	A	lang/Czech.php
	A	lang/Estonian.php
	A	lang/Japanese.php
	A	lang/Vietnamese.php
	M	recaptchalib.php
Checking svn:mergeinfo changes since r5: 1 sources, 1 changed
W: Cannot find common ancestor between e65ba2bdcd12a1935c1a327507dad6b7117f452b and 9f95df7ce49c5d1cb4715017d223a8bd1c8dcffc. Ignoring merge info.
r8 = 9a5cd8f55f377f469280171cd87219b8f528c693 (refs/remotes/svn/trunk)
	M	dd-formmailer.php
r9 = 8143782376d9fbc91a9181b367b72701205b017f (refs/remotes/svn/trunk)
	M	dd-formmailer.php
r10 = 6f563fc4c511ddcc53c2ffc5d26c1e116725bc6b (refs/remotes/svn/trunk)
	M	_svn_client_config.txt
r11 = e9130854178d2c2743a981f303cdd7f34e54b052 (refs/remotes/svn/trunk)
svnserve: E210002: Network connection closed unexpectedly
Checked out HEAD:
  svn+ssh://svn@vcs/dd-formmailer/trunk r11

$ git branch -vv --list ; git for-each-ref
* main e913085 Updated client config for Windows Scripts
e9130854178d2c2743a981f303cdd7f34e54b052 commit	refs/heads/main
7ab4f436cafc8af3ed2e727a6d8cbef1a8f8b39f commit	refs/remotes/svn/tags/vendor/5.4
cc00cb187386298cf974dda69f151d2ad4795917 commit	refs/remotes/svn/tags/vendor/5.8
e9130854178d2c2743a981f303cdd7f34e54b052 commit	refs/remotes/svn/trunk
9f95df7ce49c5d1cb4715017d223a8bd1c8dcffc commit	refs/remotes/svn/vendor/current
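Besides looking at gitk, a lost merge can be spotted by counting the parents of a commit. The following is a small sketch for that; the function name `count_parents` is made up for illustration, and it only parses a line in the format printed by `git rev-list --parents -n 1 <commit>`.

```shell
#!/bin/sh
# count_parents LINE
# LINE is "<commit> <parent>..." as printed by
# `git rev-list --parents -n 1 <commit>`; prints the number of parents.
count_parents() {
    # Intentionally unquoted: split the line into words
    # (hashes contain no glob characters).
    set -- $1
    if [ $# -eq 0 ]; then
        echo 0
        return
    fi
    echo $(( $# - 1 ))
}

# Possible use on the r8 commit from the log above (not executed here):
#   count_parents "$(git rev-list --parents -n 1 9a5cd8f55f377f469280171cd87219b8f528c693)"
```

A real merge commit should report 2 parents. Since the log above says the merge info for r8 was ignored, I would expect only 1 there.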
Attachment:
dd-formmailer-mergeinfo.svndump.7z
Description: application/7z-compressed
Attachment:
dd-formmailer-branched.svndump.7z
Description: application/7z-compressed
#!/bin/gawk -f
### GNU awk: https://www.gnu.org/software/gawk/manual/
### analyze-svn-dump-for-cross-branch-copies.sh
###
### Parameters:
### * csv=";" (or ",") - export main information as CSV
### * details=1 - see all copied paths
###
### Copyright (C) 2021 Matthias Bücher, Germany <maddes@xxxxxxxxxx>
###
### This program is free software: you can redistribute it and/or modify
### it under the terms of the GNU General Public License as published by
### the Free Software Foundation, either version 3 of the License, or
### (at your option) any later version.
###
### This program is distributed in the hope that it will be useful,
### but WITHOUT ANY WARRANTY; without even the implied warranty of
### MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
### GNU General Public License for more details.
###
### You should have received a copy of the GNU General Public License
### along with this program. If not, see <https://www.gnu.org/licenses/>.

### ATTENTION! function code has to be adapted to the structure of the repository and its historical changes
function getBranchOfNodePath(nodepath,    branch) {
	branch = ""
	### --- A) special cases
	## /vendor/*
	if (match(nodepath, /^vendor\/[^/]*/)) {
		branch = substr(nodepath, RSTART, RLENGTH)
	}
	## /tags/vendor/*
	else if (match(nodepath, /^tags\/vendor\/[^/]*/)) {
		branch = substr(nodepath, RSTART, RLENGTH)
	}
	### --- B) standard layout: /branches/*, /tags/*, /trunk
	else if (match(nodepath, /^branches\/[^/]*/)) {
		branch = substr(nodepath, RSTART, RLENGTH)
	}
	else if (match(nodepath, /^tags\/[^/]*/)) {
		branch = substr(nodepath, RSTART, RLENGTH)
	}
	else if (match(nodepath, /^trunk\//)) {
		branch = substr(nodepath, RSTART, RLENGTH-1)
	}
	### --- C) fallback: remove last component
	else {
		branch = gensub(/\/[^/]+$/, "", "", nodepath)
	}
	#
	return branch
}

BEGIN {
	FS = "\n"
}

BEGINFILE {
	## initialize variables
	revision = ""
	nodepath = ""
	nodecopyrev = ""
	nodecopypath = ""
	## initialize arrays
	delete cross_branch_copies
	delete cross_branch_copies_first
}

/^Revision-number: / {
	revision = $0
	sub(/^Revision-number: /, "", revision)
	sub(/^\s+/, "", revision)
	sub(/\s+$/, "", revision)
	#
	nodepath = ""
	nodecopyrev = ""
	nodecopypath = ""
}

/^Node-path: / {
	nodepath = $0
	sub(/^Node-path: /, "", nodepath)
	#
	nodecopyrev = ""
	nodecopypath = ""
}

/^Node-copyfrom-rev: / {
	nodecopyrev = $0
	sub(/^Node-copyfrom-rev: /, "", nodecopyrev)
}

/^Node-copyfrom-path: / {
	nodecopypath = $0
	sub(/^Node-copyfrom-path: /, "", nodecopypath)
	#
	branchfrom = getBranchOfNodePath(nodecopypath)
	branchto = getBranchOfNodePath(nodepath)
	#
	if (branchfrom != branchto) {
		cross_branch_copies[revision][branchfrom][branchto][nodepath]["nodecopypath"] = nodecopypath
		cross_branch_copies[revision][branchfrom][branchto][nodepath]["nodecopyrev"] = nodecopyrev
		if (!((branchfrom, branchto) in cross_branch_copies_first)) {
			cross_branch_copies_first[branchfrom, branchto] = revision
		}
	}
}

ENDFILE {
	foundrevs = length(cross_branch_copies)
	if (foundrevs == 0) {
		printf("=== %s: No revisions found with cross-branch svn copies\n", FILENAME)
	}
	else {
		if (csv) { backupofs=OFS }
		printf("=== %s: Found %i revisions with cross-branch svn copies\n", FILENAME, foundrevs)
		if (csv) {
			OFS=csv
			print("\"Revision\"", "\"Branch from\"", "\"Branch to\"")
			OFS=backupofs
		}
		PROCINFO["sorted_in"] = "@ind_num_asc"
		for (revision in cross_branch_copies) {
			if (!(csv)) { print(">>> Revision:", revision) }
			count = 0
			PROCINFO["sorted_in"] = "@ind_str_asc"
			for (branchfrom in cross_branch_copies[revision]) {
				for (branchto in cross_branch_copies[revision][branchfrom]) {
					if (csv) {
						csvrevision = revision
						csvbranchfrom = "\"" gensub(/"/, "\"\"", "g", branchfrom) "\""
						csvbranchto = "\"" gensub(/"/, "\"\"", "g", branchto) "\""
						OFS=csv
						print(csvrevision, csvbranchfrom, csvbranchto)
						OFS=backupofs
					}
					else {
						printf(" svn copy from \"%s\" to \"%s\"\n", branchfrom, branchto)
						if (details) {
							for (nodepath in cross_branch_copies[revision][branchfrom][branchto]) {
								count++
								printf(" %4i. \"%s\" (Revision %i) to \"%s\"\n", count, cross_branch_copies[revision][branchfrom][branchto][nodepath]["nodecopypath"], cross_branch_copies[revision][branchfrom][branchto][nodepath]["nodecopyrev"], nodepath)
							} ## nodepath
						} ## details
					} ## csv
				} ## branchto
			} ## branchfrom
		} ## revision
		#
		printf("--- %s: List of first revision of each cross-branch copy\n", FILENAME)
		PROCINFO["sorted_in"] = "@val_num_asc"
		if (csv) {
			OFS=csv
			print("\"Revision\"", "\"Branch from\"", "\"Branch to\"")
			OFS=backupofs
		}
		for (combined in cross_branch_copies_first) { ## combined: branchfrom, branchto
			split(combined, separate, SUBSEP)
			if (csv) {
				csvrevision = cross_branch_copies_first[combined]
				csvbranchfrom = "\"" gensub(/"/, "\"\"", "g", separate[1]) "\""
				csvbranchto = "\"" gensub(/"/, "\"\"", "g", separate[2]) "\""
				OFS=csv
				print(csvrevision, csvbranchfrom, csvbranchto)
				OFS=backupofs
			}
			else {
				printf(" svn copy from \"%s\" to \"%s\" first in revision %i\n", separate[1], separate[2], cross_branch_copies_first[combined])
			} ## csv
		} ## combined
		#
		count = length(cross_branch_copies_first)
		printf("^^^ %s: Found %i revisions (unique %i) with cross-branch svn copies\n", FILENAME, foundrevs, count)
	} ## foundrevs
}