On 05/17/2013 01:50 PM, Martin Langhoff wrote:
> On Fri, May 17, 2013 at 5:10 AM, Michael Haggerty <mhagger@xxxxxxxxxxxx> wrote:
>> For one-time imports, the fix is to use a tool that is not broken, like
>> cvs2git.
>
> As one of the earlier maintainers of cvsimport, I do believe that
> cvs2git is less broken, for one-shot imports, than cvsimport. Users
> looking for a one-shot import should not use cvsimport as there are
> better options there. Myself, I have used parsecvs (long ago, so
> perhaps it isn't the best of the crop nowadays).
>
> TBH, I am puzzled and amused at all the chest-thumping about cvs
> importers. Yeah, yours is a bit better or saner, but we all wade in
> the muddle of essentially broken data. So "is not broken" is rather
> misleading when talking to end users. It carries so many caveats about
> whether it'll work on the users' particular repo that it is not a
> generally truthful statement.

I disagree. I use the following definition of "correct":

    The Git history output by an importer must not contradict the
    history that is recorded in CVS.

We both know that the CVS history omits important data, and that the
history is mutable, etc. So there are lots of hypothetical histories
that do not contradict CVS. But some things are recorded unambiguously
in the CVS history, like

* The contents at any tag or the tip of any branch (i.e., what is in
  the working tree when you check it out).

* The order of modifications to a single file on a single branch and
  the file contents after each of those revisions.

* Who committed a particular change, and approximately when (modulo
  clock skew).

If a tool doesn't get these things correct (especially the first!) then
it should only be used with great caution. cvsimport can make mistakes
on the first two. As far as I know, cvs2svn/cvs2git are correct
according to this definition.

That being said, I appreciate that cvsimport can do incremental imports.
cvs2git doesn't even attempt it.
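
To make the first criterion above concrete, here is a minimal Python
sketch of how one might check it. This is an illustration only, not
something that ships with cvs2git or cvsimport. It assumes the same tag
has already been checked out from CVS and from the converted Git
repository into two directories, and the names snapshot and
compare_trees are made up for this example. Differences in keyword
expansion between the two checkouts may show up as spurious mismatches.

```python
import hashlib
import os

# Per-tool metadata directories; these are not project content.
IGNORED_DIRS = {"CVS", ".git"}


def snapshot(root):
    """Map each file path (relative to root) to a SHA-256 of its bytes."""
    result = {}
    for dirpath, dirnames, filenames in os.walk(root):
        # Prune metadata directories in place so os.walk skips them.
        dirnames[:] = [d for d in dirnames if d not in IGNORED_DIRS]
        for name in filenames:
            full = os.path.join(dirpath, name)
            rel = os.path.relpath(full, root)
            with open(full, "rb") as fh:
                result[rel] = hashlib.sha256(fh.read()).hexdigest()
    return result


def compare_trees(cvs_root, git_root):
    """Return sorted (path, problem) pairs; an empty list means agreement."""
    cvs, git = snapshot(cvs_root), snapshot(git_root)
    problems = []
    for path in sorted(set(cvs) | set(git)):
        if path not in git:
            problems.append((path, "missing in git"))
        elif path not in cvs:
            problems.append((path, "missing in cvs"))
        elif cvs[path] != git[path]:
            problems.append((path, "content differs"))
    return problems
```

Running compare_trees on the two checkouts for every tag and branch tip
and getting an empty list each time would be evidence for the first
criterion only; it says nothing about per-file revision order or
authorship.
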
I've thought about what it would take to implement correct incremental
imports in cvs2svn/cvs2git, and it is far beyond the budget of time that
I have for the project. So I definitely give props to cvsimport for
attempting incremental imports and apparently often doing a good enough
job that it is useful to people.

> [...]
> At the time, I looked into trying to use cvs2svn (precursor to
> cvs2git) as the "CVS read" side of cvsimport, but it did not support
> incremental imports at all, and it took forever to run.

cvs2svn still doesn't support incremental imports, and it still takes a
long time to run (though less than before). cvs2git is considerably
faster, partly because of the speed and convenience of using
git-fast-import. But conversion time is much less of an issue for
one-time conversions.

> It was a time when git was new and people were dipping their toes in
> the pool, and some developers were pining to use git on projects that
> used CVS (like we use git-svn now). Incremental imports were a must.
>
> One of the nice features of cvsimport is that it can do incrementals
> on a repo imported with another tool. That earns it a place under the
> sun. If it didn't have that, I'd be voting for removal (after a review
> that the replacement *is* actually better ;-) across a number of test
> repos).

Incremental imports are indeed the saving grace of cvsimport and for
that reason I don't advocate its removal. But I think we should be
clearer about warning users against using it for one-time imports,
because it can produce output that is *objectively* incorrect in
important ways.

Regarding tests, the failing tests that I added to the cvsimport test
suite a few years ago were taken directly from the cvs2svn/cvs2git test
suite, where they pass :-)

Michael

--
Michael Haggerty
mhagger@xxxxxxxxxxxx
http://softwareswirl.blogspot.com/