On Tue, 2006-04-04 at 14:42 +1200, Martin Langhoff wrote:
> Cool. What's the matter with the Pg repo? (Where can I get hold of that repo?)

As usual, the detection of branch locations is messed up. The
postgresql CVS tree is available via:

	rsync anoncvs.postgresql.org::pgsql-cvs/* postgresql.cvs

It's a fairly hefty 300M.

> > > Does it run incrementally? Can it discover non-binary files and pass -kk?
> >
> > It doesn't run incrementally, and it unconditionally passes -kk. It's
>
> I thought that the .git-cvs directory it created was to be able to run
> incrementally (btw, I think it's fair game to create subdirs inside
> .git for this kind of status-tracking). And passing -kk unconditionally
> is destructive in some cases (I know... git-cvsimport does it, and I
> want to fix that). If you can ask rcs about the mode of the file and
> not pass -kk for binary files...

Nah, the .git-cvs directory is purely for debugging; I leave the
various command outputs there so I can see what went wrong.

I don't really have a good idea of how we'd do this process
incrementally; that's not something I am personally interested in
either. I want to run screaming from CVS as fast as I can at this
point.

> > currently using rcs to check out versions of the files, so it should
> > deal with binary content as well as rcs does. Is there something magic I
> > need to do here? Like for DOS?
>
> We'll let DOS take care of itself ;)

I did discover that rcs has less sophisticated keyword substitution
than cvs; it offers no way to customize the expansion. I guess we need
to figure out when to pass -ko and when to pass -kk (a rough sketch of
one way to decide is appended below the signature).

The other alternative I'd like to get around to trying is to generate
all of the revision contents directly from the ,v file. I've just
changed parsecvs to generate blobs for every revision in each ,v file
right after it is read in; putting the necessary code right into
parsecvs should be reasonably straightforward. We don't need the
multi-patch logic, since we want to compute every intermediate version
of the file anyway.

With the blobs all generated, the rest of the operation is a simple
matter of building suitable indices and creating commits out of them.
That's a reasonably fast operation now, as it doesn't manipulate any
file contents. Plus, I can do all of the index operations with a
single git-update-index command, which eliminates a pile of forking.
Doing the file revision generation in-line would let us eliminate most
of the remaining forks; we'd run one git-hash-object per file (or so),
then one git-update-index, git-write-tree and git-commit-tree per
resulting commit (a sketch of that plumbing sequence is also appended
below).

-- 
keith.packard@xxxxxxxxx
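
[Editorial sketch] A minimal illustration of the -ko/-kk decision
discussed above, assuming plain ,v files on disk; the file name and
revision number are made up, and this is not what parsecvs currently
does. rlog's header reports the keyword-substitution mode recorded in
the ,v file, so files checked in as binary can be spotted before
checkout:

	# Illustration only: pick the rcs keyword flag per file by reading
	# the "keyword substitution:" line from rlog's header output.
	# foo.c,v and revision 1.3 are made-up placeholders.
	if rlog -h foo.c,v | grep -q '^keyword substitution: b'; then
		kflag=-ko   # binary (checked in with -kb): leave keywords untouched
	else
		kflag=-kk   # text: collapse keyword values, as cvs export would
	fi
	co $kflag -p1.3 foo.c,v > foo.c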
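
[Editorial sketch] And a sketch of the plumbing sequence from the last
paragraph; the paths, the blob source file, the log message and $parent
are placeholders rather than anything parsecvs actually emits:

	# One git-hash-object per file revision; the reconstructed revision
	# contents would be written to a temporary file first.
	blob=$(git-hash-object -w /tmp/main.c.1.5)

	# All index entries for a commit go through a single git-update-index;
	# feed one "<mode> <sha1><TAB><path>" line per file on stdin.
	printf '100644 %s\tsrc/main.c\n' "$blob" | git-update-index --index-info

	# Then one tree and one commit object per resulting changeset.
	tree=$(git-write-tree)
	commit=$(echo 'log message taken from the CVS deltas' |
	         git-commit-tree "$tree" -p "$parent")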