On 6/22/06, Martin Langhoff <martin.langhoff@xxxxxxxxx> wrote:
> On 6/23/06, Jon Smirl <jonsmirl@xxxxxxxxx> wrote:
> > cvsps keeps its incremental status in ~/.cvsps/*. parsecvs might want
> > to keep its status in the .git repository and use tags to locate it.
> > You could even have a utility to show when and what was imported. By
> > keeping everything in git it doesn't matter who runs the incremental
> > update commands.
>
> Jon, what cvsps keeps is a cache of what it knows about the repo
> history, to ask only for new commits. Now, cvsps will always write to
> STDOUT the full history, and git-cvsimport discards the commits it has
> already seen, based on reading the state of each git head.
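To make the "discard what has already been seen" step concrete, here is a minimal sketch in Python. It is not git-cvsimport's actual code: the `PatchSet` record, the `unseen_patchsets` function, and the idea of comparing timestamps against the newest commit on each head are all assumptions made for illustration.

```python
from typing import Dict, Iterable, Iterator, NamedTuple


class PatchSet(NamedTuple):
    """Hypothetical stand-in for one patchset in the full history stream."""
    branch: str
    timestamp: int  # seconds since the epoch
    log: str


def unseen_patchsets(patchsets: Iterable[PatchSet],
                     head_timestamps: Dict[str, int]) -> Iterator[PatchSet]:
    """Yield only patchsets newer than what each git head already holds.

    head_timestamps maps a branch name to the timestamp of the newest
    commit already imported on that head (an assumed representation of
    "the state of each git head").
    """
    for ps in patchsets:
        last = head_timestamps.get(ps.branch)
        if last is not None and ps.timestamp <= last:
            continue  # already imported on this branch, skip it
        yield ps
```

The importer needs no state of its own here: everything it needs to know about earlier runs is read back from the git heads.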
The cache is 723MB for the Mozilla repo. Since the info gets cached in
my home directory, anyone else who needs to sync the repo doesn't get to
use the cache.

[jonsmirl@jonsmirl .cvsps]$ pwd
/home/jonsmirl/.cvsps
[jonsmirl@jonsmirl .cvsps]$ ls -l
total 707492
-rw-rw-r-- 1 jonsmirl jonsmirl 723758657 Jun 15 16:10 #home#mozcvs##mozilla
[jonsmirl@jonsmirl .cvsps]$

Keith is rewriting parsecvs. If you analyze all of the data structures,
the info needed for the conversion should be able to fit into well under
100MB instead of the ~2GB the current programs are using.

There are lots of ways to reduce memory consumption. You can turn CVS
revisions into git IDs as soon as the revision is seen. That lets you
get away from tracking file names and long CVS revision numbers. It also
works to turn the author/log fields immediately into a hash. Where
possible, switching to arrays instead of linked lists saves memory too.

Some stats:
1M revisions
200K unique changesets (author/log combos)
200KB symbols
1,800 branches

cvsps has the lowest memory consumption; it uses 1200 bytes per
revision. It looks like it is possible to lower this to less than 100
bytes per rev.
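As a rough illustration of those three ideas (an early git ID per revision, author/log hashed into a changeset number, arrays instead of linked nodes), here is a small Python sketch. It is not how parsecvs or cvsps are written; `ChangesetTable`, `RevisionStore`, and the choice of fields kept per revision are assumptions for the example. The only git-specific fact used is that a blob ID is the SHA-1 of the header "blob <size>\0" followed by the file content.

```python
import hashlib
from array import array


def git_blob_id(content: bytes) -> bytes:
    """Return the 20-byte git blob ID for a file revision's content."""
    header = b"blob %d\0" % len(content)
    return hashlib.sha1(header + content).digest()


class ChangesetTable:
    """Maps each unique (author, log) pair to a small integer."""

    def __init__(self):
        self._ids = {}  # SHA-1 digest of author/log -> changeset number

    def intern(self, author: str, log: str) -> int:
        digest = hashlib.sha1(f"{author}\0{log}".encode("utf-8")).digest()
        # The default is computed before insertion, so a new pair gets
        # the next unused number.
        return self._ids.setdefault(digest, len(self._ids))

    def __len__(self):
        return len(self._ids)


class RevisionStore:
    """Keeps per-revision data in flat arrays (hypothetical layout)."""

    def __init__(self):
        self.blob_ids = bytearray()   # 20 bytes per revision
        self.changeset = array("I")   # index into the ChangesetTable
        self.timestamp = array("q")   # commit time, seconds since epoch

    def add(self, blob_id: bytes, changeset_id: int, timestamp: int) -> int:
        if len(blob_id) != 20:
            raise ValueError("expected a 20-byte SHA-1 digest")
        self.blob_ids += blob_id
        self.changeset.append(changeset_id)
        self.timestamp.append(timestamp)
        return len(self.changeset) - 1

    def bytes_per_revision(self) -> float:
        n = len(self.changeset)
        payload = (len(self.blob_ids)
                   + n * self.changeset.itemsize
                   + n * self.timestamp.itemsize)
        return payload / n
```

With this layout the file name, the CVS revision string, and the log text no longer have to be stored per revision. At 1M revisions, 1200 bytes per rev is about 1.2GB, while anything under 100 bytes per rev stays under roughly 100MB.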
> So cvsps + git-cvsimport don't keep any extra data around, and I am
> 100% certain that parsecvs doesn't need that either.
>
> cheers,
>
> martin
--
Jon Smirl
jonsmirl@xxxxxxxxx