Re: I have end-of-lifed cvsps

"Eric S. Raymond" <esr@xxxxxxxxxxx> · Tue, 17 Dec 2013 13:47:24 -0500

Johan Herland <johan@xxxxxxxxxxx>:
> However, I fear that you underestimate the number of users that want
> to use Git against CVS repos that are orders of magnitude larger (in
> both dimensions: #commits and #files) than your example repo.

You may be right. See below...

I'm working with Alan Barrett now on trying to convert the NetBSD
repositories. They break cvs-fast-export through sheer bulk of
metadata, by running the machine out of core.  This is exactly
the kind of huge case that you're talking about.

Alan and I are going to take a good hard whack at modifying cvs-fast-export
to make this work, because there really aren't any feasible alternatives.
The analysis code in cvsps was never good enough. cvs2git, being written
in Python, would hit the core limit faster than anything written in C.

> Although a full-history converter with fairly stable output can be
> made to support this second problem for repos up to a certain size,
> there will probably still be users that want to work incrementally
> against much bigger repos, and I don't think _any_
> full-history-gone-incremental importer will be able to support the
> biggest repos.
> 
> Consequently I believe that for these big repos it is _impossible_ to
> get both fast incremental workflows and a high degree of (historical)
> correctness.
> 
> cvsps tried to be all of the above, and failed badly at the
> correctness criteria. Therefore I support your decision to "shoot it
> through the head". I certainly also support any work towards making a
> full-history converter work in an incremental manner, as it will be
> immensely useful for smaller CVS repos. But at the same time we should
> realize that it won't be a solution for incrementally working against
> _large_ CVS repos.

It is certainly the case that a sufficiently large CVS repo will break
anything, like a star with a mass over the Chandrasekhar limit becoming a 
black hole :-)

The question is how common such supermassive cases are. My own guess is that
the *BSD repos and a handful of the oldest GNU projects are pretty much the
whole set; everybody else converted to Subversion within the last decade. 

> Although it should have been made obvious a long time ago, the removal
> of cvsps has now made it abundantly clear that Git currently provides
> no way to support the incremental workflow against large CVS repos.
> Maybe that is ok, and we can ignore that, waiting for the few
> remaining large CVS repos to die? Or maybe we need a new effort to
> fill this niche? Something that is NOT based on a full-history
> converter, and does NOT try to guarantee a history-correct conversion,
> but that DOES try to guarantee fast and relatively worry-free two-way
> synchronization against a CVS server. Unfortunately (or fortunately,
> depending on POV) I have not had to touch CVS in a long while, and I
> don't see that changing soon, so it is not my itch to scratch.

Nor mine.  I find the very idea of writing anything that encourages
non-history-correct conversions disturbing and want no part of it.

Which matters, because right now the set of people working on CVS lifters
begins with me and ends with Michael Haggerty (cvs2git), who seems even
less interested in incremental conversion than I am.  Unless somebody
comes out of nowhere and wants to own that problem, it's not going
to get solved.
-- 
		Eric S. Raymond <http://www.catb.org/~esr/>