On 6/14/06, Keith Packard <keithp@xxxxxxxxxx> wrote:
cvs rlog is designed to 'represent' the history of the repository to users. Cvsps was built as a software analysis tool, and is used by putative software engineering researchers. Basing a supposedly lossless repository conversion system on this pair seems foolish to me, notwithstanding the heroic efforts to make it work.
Yes, cvsps is relying on the wrong things. I am looking at parsecvs and the cvs2svn tool and wondering where to go from here.

In terms of history parsing, parsecvs and cvs2svn are similar. I like cvs2svn's "many passes" approach better, though the Python source is really messy. A good thing about cvs2svn is that it is a lot more conservative WRT memory use.

So far, I have been relying on parsecvs for initial imports, and on cvsps+git-cvsimport for incrementals on top of that initial import. But parsecvs falls over with large repos.

I am starting to look at what I can do with cvs2svn to get the import into git. It seems to get very good patchsets, and it yields an easily readable DB. I'll either learn Python, or read the DB from Perl (probably from git-cvsimport).

The main problem, however, is that cvs2svn doesn't do incremental imports, so this would only be a roundabout way of fixing parsecvs's memory-bound-ness. We still need cvsps :(

martin