Greetings,

I am attempting to improve the CVS -> CVSps -> Git-cvsimport process.
The part involving Git-cvsimport has to do with parsing of the CVSps
PatchSet file.

Consider what happens if a CVS log/commit message includes lines
which start with "Members:", say from copy-and-paste [2].

To avoid this issue, I have proposed that CVSps append the line count
of the original CVS log/commit message to the "Log:" tag [1].

The idea is that if a line count is found after "Log:", that many
(CVS log message) lines get consumed before advancing $state to look
for "^Members:".

Current Git-cvsimport isn't strict in matching the "Log:" tag
(fortunately), and my proposed change to Git-cvsimport should be
fully backward compatible.

See attached patch.

Cheers,
--patrick

p.s., For reference: Why I'm doing this and RFC sent to CVS list:
http://lists.nongnu.org/archive/html/info-cvs/2017-11/msg00000.html

[1] https://github.com/andreyvit/cvsps/pull/4

[2] Example PatchSet with "Members:" line in original CVS commit
    message:

---------------------
PatchSet 3
Date: 2017/10/30 23:25:20
Author: catbert
Branch: HEAD
Tag: (none)
Log:
This will confuse git-cvsimport's parser
Members:
	somefile.c:1.1->1.2
	another.h:1.7->1.8
	foo.mk:1.22->1.23
Imagine these were lines pasted to note something

Members:
	ABC:1.1->1.2

commit c3e406c54b8cd3a2bbf0aa729fef201e20fa6df5
Author: patrick keshishian <pkeshish@xxxxxxxxx>
Date:   Sat Nov 4 08:42:12 2017 -0700

    Optionally parse line count out of PatchSets with "Log: count"

    This is a change being suggested to CVSps where the line count of
    the commit message gets added to the "Log:" tag to help Git
    cvsimport not get confused if the CVS log/commit message included
    lines starting with any of the tags found in CVSps PatchSet, e.g.,
    Members:

    This is part of a larger change to make CVS to Git import more
    robust.

diff --git a/git-cvsimport.perl b/git-cvsimport.perl
index 36929921e..5d78c5e87 100755
--- a/git-cvsimport.perl
+++ b/git-cvsimport.perl
@@ -786,6 +786,13 @@ open(CVS, "<$cvspsfile") or die $!;
 #
 #---------------------
 
+# NOTE:
+## pk, 2017/10/30
+# patched cvsps will output ^Log: line with number of lines of log
+# which are to follow.  This makes parsing robust for cases where the
+# log message contains ^Members: lines!  Happens in OpenBSD sources:
+# e.g., See src/usr.sbin/bgpd/rde.c
+
 my $state = 0;
 
 sub update_index (\@\@) {
@@ -816,7 +823,7 @@ sub write_tree () {
 	return $tree;
 }
 
-my ($patchset,$date,$author_name,$author_email,$author_tz,$branch,$ancestor,$tag,$logmsg);
+my ($patchset,$date,$author_name,$author_email,$author_tz,$branch,$ancestor,$tag,$logmsg,$loglines);
 my (@old,@new,@skipped,%ignorebranch,@commit_revisions);
 
 # commits that cvsps cannot place anywhere...
@@ -1005,8 +1012,13 @@ while (<CVS>) {
 			$tag = $_;
 		}
 		$state = 7;
-	} elsif ($state == 7 and /^Log:/) {
+	} elsif ($state == 7 and /^Log:\s*(\d+)?$/) {
+		$loglines = $1 // -1;
 		$logmsg = "";
+		while ($loglines-- > 0 && ($_ = <CVS>)) {
+			chomp;
+			$logmsg .= "$_\n";
+		}
 		$state = 8;
 	} elsif ($state == 8 and /^Members:/) {
 		$branch = $opt_o if $branch eq "HEAD";
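
For anyone who wants to try the idea outside of git-cvsimport, here is
a minimal standalone sketch of the "Log: count" handling from the patch
above. It is not the real parser: parse_patchset() is a hypothetical
helper that only mimics states 7 through 9 in a simplified way, and the
sample assumes a patched cvsps would emit "Log: 6" for the six-line
message in example [2] (how the count treats the trailing blank line is
an assumption here).

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Sample PatchSet from [2], with the line count a patched cvsps is
# assumed to emit after "Log:" (6 lines of original commit message).
my $with_count = <<"EOF";
---------------------
PatchSet 3
Date: 2017/10/30 23:25:20
Author: catbert
Branch: HEAD
Tag: (none)
Log: 6
This will confuse git-cvsimport's parser
Members:
\tsomefile.c:1.1->1.2
\tanother.h:1.7->1.8
\tfoo.mk:1.22->1.23
Imagine these were lines pasted to note something

Members:
\tABC:1.1->1.2

EOF

# Same PatchSet as an unpatched cvsps would write it (bare "Log:").
(my $without_count = $with_count) =~ s/^Log: 6$/Log:/m;

# Hypothetical helper: a simplified stand-in for states 7, 8 and 9.
# Returns the log message and a reference to the list of member files.
sub parse_patchset {
    my ($text) = @_;
    open(my $fh, '<', \$text) or die "cannot read in-memory PatchSet: $!";
    my ($state, $logmsg, @members) = (7, "");
    while (my $line = <$fh>) {
        chomp $line;
        if ($state == 7 and $line =~ /^Log:\s*(\d+)?$/) {
            # With a count, swallow exactly that many message lines so a
            # pasted "Members:" inside the message is never inspected.
            my $loglines = defined $1 ? $1 : -1;
            while ($loglines-- > 0 and defined(my $msgline = <$fh>)) {
                $logmsg .= $msgline;
            }
            $state = 8;
        } elsif ($state == 8 and $line =~ /^Members:/) {
            $state = 9;
        } elsif ($state == 8) {
            $logmsg .= "$line\n";
        } elsif ($state == 9 and $line =~ /^\s+(\S+):(\S+)->(\S+)$/) {
            push @members, $1;
        }
    }
    close $fh;
    return ($logmsg, \@members);
}

for my $case (['with count', $with_count], ['without count', $without_count]) {
    my ($msg, $members) = parse_patchset($case->[1]);
    printf "%-14s members: %s\n", $case->[0], join(', ', @$members);
}
```

With the count present, only the real member (ABC) should be reported;
with a bare "Log:", the pasted file names get picked up as members too,
which is the confusion the patch is meant to avoid. A bare "Log:" still
parses, which is why the change should stay backward compatible.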