I just imported an svn repository with about 120 tags and 140 branches, and with some repacking got the pack file down to a comfortable 80 MB. However, .git is over 600 MB, owing to about 520 MB of git-svn metadata. (This wasn't a problem when I only tracked a handful of branches, since the metadata is only a few MB per branch.)

There appear to be two kinds of metadata that take up a significant fraction of the space:

  * An index file is saved for each branch and tag. I presume this corresponds to the branch head, and is used to speed up importing new revisions to that branch. However, recreating an index with git-read-tree is very fast, so I don't think these need to be saved between git-svn runs.

  * A "rev_db" file is saved for each branch and tag. This is a text file with one sha1 per line -- I seem to remember that line X of this file is the commit sha1 of svn revision X. For revisions that didn't touch this branch/tag, there's a line of 40 zeros. And since every revision touches just one branch, each file is almost all zeros unless the number of branches is very small. This could probably be stored _much_ more efficiently. Just gzipping it with the standard options shrinks it by a factor of between 4 (for one of the busiest branches) and 300 (for a tag, which is written just once). But I understand that we need quick random access here?

The index files should be easy enough to erase between runs, if they indeed just correspond to the branch head. The rev_db files are trickier; exactly what kind of lookups are required? Could it perhaps be done with just one file, instead of one per branch/tag?

-- 
Karl Hasselström, kha@xxxxxxxxxxx
      www.treskal.com/kalle
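
To make the rev_db space question concrete, here is a minimal Python sketch of the layout as described in the message above, next to one possible sparse alternative. It is not git-svn's code. It assumes fixed-width records (40 hex digits plus a newline) with record X holding the sha1 for svn revision X, which is only what the message recalls. All function names are hypothetical, and the sorted-pairs layout is just one way to keep fast lookups without storing the zero lines.

```python
import bisect
import io

SHA_LEN = 40
RECORD_LEN = SHA_LEN + 1          # assumed: 40 hex digits plus a newline
ZERO_SHA = "0" * SHA_LEN          # placeholder for "revision did not touch this branch"


def build_rev_db(commits, max_rev):
    """Build dense rev_db-style bytes: one record per revision 0..max_rev."""
    lines = [commits.get(rev, ZERO_SHA) for rev in range(max_rev + 1)]
    return "".join(line + "\n" for line in lines).encode("ascii")


def lookup_dense(db, rev):
    """Random access in the dense layout: seek straight to rev * RECORD_LEN."""
    db.seek(rev * RECORD_LEN)
    sha = db.read(SHA_LEN).decode("ascii")
    if len(sha) < SHA_LEN or sha == ZERO_SHA:
        return None
    return sha


def build_sparse(commits):
    """Hypothetical alternative: sorted (revision, sha1) pairs, touched revisions only."""
    return sorted(commits.items())


def lookup_sparse(pairs, rev):
    """Binary search over the sorted pairs instead of a direct seek."""
    i = bisect.bisect_left(pairs, (rev, ""))
    if i < len(pairs) and pairs[i][0] == rev:
        return pairs[i][1]
    return None


if __name__ == "__main__":
    # A tag-like ref touched by two revisions out of 1001 (made-up values).
    commits = {3: "a" * SHA_LEN, 1000: "b" * SHA_LEN}
    dense = build_rev_db(commits, 1000)
    sparse = build_sparse(commits)
    print(len(dense), "bytes dense;", len(sparse), "entries sparse")
    print(lookup_dense(io.BytesIO(dense), 3) == lookup_sparse(sparse, 3))
```

The dense layout gives constant-time lookups but grows with the highest revision number for every branch and tag. The sparse one grows only with the number of revisions that touched the ref, at the cost of a logarithmic search. Whether that trade-off suits git-svn depends on which lookups it actually performs, which is the open question above.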