David Mansfield wrote:
> The design goal of cvsps was always simply to show who did what and in
> what chronological order. However, just with that, it's impossible to
> use for the purpose it is currently being used for.

Good point. I re-read the cvsps manpage and found the information about
"FUNKY" and "INVALID" tags. I'd forgotten that cvsps does the right thing
in some cases by warning the user about tags that are beyond its abilities
to describe. (But there are other problems that cvsps doesn't warn about;
see below.)

Then I looked in the git-cvsimport code to see how it deals with FUNKY and
INVALID tags. It does the *wrong* thing by explicitly ignoring these
warnings (!). IMHO git-cvsimport should notice the **FUNKY** and
**INVALID** annotations and at least output a warning to the user that the
associated tags may not have been converted correctly.

But cvsps makes some other symbol-related mistakes, presumably in the name
of simplification. These problems make it impossible for git-cvsimport to
generate accurate branches and tags, even if it were to use fixup branches
internally. Moreover, many of these are silent failures; there is no way
that git-cvsimport could even determine that the cvsps output is
inadequate. For example, if I understand correctly:

- cvsps pretends that a tag or branch is applied to a single snapshot of
  the repository on a single branch, even though in reality:

  - some files might have been left out of the tag/branch (cvsps doesn't
    give any indication if this was the case). If this tag/branch is
    checked out, the files that were not tagged are erroneously included.

  - the revisions being tagged might not have all existed
    contemporaneously (cvsps indicates these cases by marking the tags
    **FUNKY** or **INVALID**).

  - a tag can be applied to different files on different branches; e.g.,
    a tag can contain file1:1.3 (from trunk) and file2:1.2.2.1 (from
    some other branch).
    cvsps seems to pick one branch as source without indicating a
    problem. The inevitable result in cvsps is that the tag includes the
    wrong contents for some files with no way to detect the error.

- If there is no commit on a branch, cvsps ignores the branch entirely.
  (Maybe this is fixed by your recent patch?)

- If there are multiple tags applied to the same set of file revisions
  (for example, a daily tag and a release tag), cvsps silently ignores all
  but one of them. This causes unavoidable data loss in git-cvsimport.

There are lots of more complicated scenarios that I haven't tested against
cvsps...

Granted, cvsps was not written to be usable for converters. But regardless
of whether the output is being read by a human or by another program, its
output can be wrong, and there is often no way to tell from the output
that it had a problem. Maybe cvsps could emit warning annotations in more
of the situations that it punts on, and git-cvsimport could pass these
warnings along to the end user? Otherwise people will believe that
git-cvsimport is converting their repository accurately when in fact it
often silently produces incorrect output.

> The place where the fixup branch logic needs to be is in git-cvsimport,
> not in cvsps. Better yet, get rid of git-cvsimport and replace it with
> cvs2git if it works better.

cvs2git hopefully gives a more accurate conversion of a CVS repository --
it handles all of the cases described above, plus many more [1] -- but it
is much slower and can't work incrementally. So there is definitely still
demand for something like git-cvsimport.

Michael

[1] http://cvs2svn.tigris.org/features.html
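
P.S. To illustrate the kind of warning suggested above, here is a minimal
Python sketch. It is not git-cvsimport code (that tool is a Perl script),
and it rests on an assumption: that cvsps reports flagged tags on lines
shaped like "Tag: NAME **FUNKY**" or "Tag: NAME **INVALID**". The function
names find_suspect_tags and warn_about_tags are hypothetical, chosen only
for this example.

```python
import re
import sys

# ASSUMPTION: a flagged tag appears in cvsps output as a line such as
# "Tag: RELEASE_1 **FUNKY**" or "Tag: RELEASE_1 **INVALID**".
TAG_LINE = re.compile(r"^Tag:\s+(\S+)\s*(?:\*\*(FUNKY|INVALID)\*\*)?\s*$")


def find_suspect_tags(lines):
    """Return (tag, annotation) pairs for tags carrying a warning annotation."""
    suspects = []
    for line in lines:
        match = TAG_LINE.match(line.rstrip("\n"))
        # Unflagged tags and "Tag: (none)" lines match without an annotation.
        if match and match.group(2):
            suspects.append((match.group(1), match.group(2)))
    return suspects


def warn_about_tags(lines, stream=sys.stderr):
    """Print one warning per flagged tag and return how many were found."""
    suspects = find_suspect_tags(lines)
    for tag, annotation in suspects:
        print(
            f"warning: tag {tag} is marked {annotation}; "
            "it may not have been converted correctly",
            file=stream,
        )
    return len(suspects)
```

Feeding the saved cvsps output through warn_about_tags would at least tell
the user which tags to double-check. It does nothing for the silent
failures listed above, since those leave no trace in the output.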