Johan Herland <johan@xxxxxxxxxxx>:
> However, I fear that you underestimate the number of users that want
> to use Git against CVS repos that are orders of magnitude larger (in
> both dimensions: #commits and #files) than your example repo.

You may be right. See below...

I'm working with Alan Barret now on trying to convert the NetBSD
repositories. They break cvs-fast-export through sheer bulk of
metadata, by running the machine out of core. This is exactly
the kind of huge case that you're talking about.

Alan and I are going to take a good hard whack at modifying
cvs-fast-export to make this work. Because there really aren't any
feasible alternatives. The analysis code in cvsps was never good
enough. cvs2git, being written in Python, would hit the core limit
faster than anything written in C.

> Although a full-history converter with fairly stable output can be
> made to support this second problem for repos up to a certain size,
> there will probably still be users that want to work incrementally
> against much bigger repos, and I don't think _any_
> full-history-gone-incremental importer will be able to support the
> biggest repos.
>
> Consequently I believe that for these big repos it is _impossible_ to
> get both fast incremental workflows and a high degree of (historical)
> correctness.
>
> cvsps tried to be all of the above, and failed badly at the
> correctness criteria. Therefore I support your decision to "shoot it
> through the head". I certainly also support any work towards making a
> full-history converter work in an incremental manner, as it will be
> immensely useful for smaller CVS repos. But at the same time we should
> realize that it won't be a solution for incrementally working against
> _large_ CVS repos.

It is certainly the case that a sufficiently large CVS repo will break
anything, like a star with a mass over the Chandrasekhar limit
becoming a black hole :-)

The question is how common such supermassive cases are.
My own guess is that the *BSD repos and a handful of the oldest GNU
projects are pretty much the whole set; everybody else converted to
Subversion within the last decade.

> Although it should have been made obvious a long time ago, the removal
> of cvsps has now made it abundantly clear that Git currently provides
> no way to support the incremental workflow against large CVS repos.
> Maybe that is ok, and we can ignore that, waiting for the few
> remaining large CVS repos to die? Or maybe we need a new effort to
> fill this niche? Something that is NOT based on a full-history
> converter, and does NOT try to guarantee a history-correct conversion,
> but that DOES try to guarantee fast and relatively worry-free two-way
> synchronization against a CVS server. Unfortunately (or fortunately,
> depending on POV) I have not had to touch CVS in a long while, and I
> don't see that changing soon, so it is not my itch to scratch.

Nor mine. I find the very idea of writing anything that encourages
non-history-correct conversions disturbing and want no part of it.

Which matters, because right now the set of people working on CVS
lifters begins with me and ends with Michael Rafferty (cvs2git), who
seems even less interested in incremental conversion than I am. Unless
somebody comes out of nowhere and wants to own that problem, it's not
going to get solved.
-- 
		<a href="http://www.catb.org/~esr/">Eric S. Raymond</a>