On Fri, 24 Feb 2006, Carl Baldwin wrote:

> Junio,
>
> This message came to me at exactly the right time. Yesterday I was
> exploring using git as the content storage back-end for some binary
> files. Up until now I've only used it for software projects.
>
> I found the largest RCS file that we had in our current back-end. It
> contained twelve versions of a binary file. Each version averaged
> about 20 MB. The ,v file from RCS was about 250 MB. I did some
> experiments on these binary files.
>
> First, gzip consistently compresses these files to about 10% of
> their original size, so they are quite inflated. Second, xdelta
> would produce a delta between two neighboring revisions of about
> 2.5 MB, which would compress down to about 2 MB (about the same size
> as the next revision compressed without deltification, so packing is
> ineffective here).
>
> I added these 12 revisions to several version-control back-ends,
> including Subversion and git. Git produced a much smaller repository
> than the others, simply due to the compression it applies to
> objects. It was also at least as fast as the others.
>
> The problem came when I tried to clone this repository.
> git-pack-objects chewed on these 12 revisions for over an hour
> before I finally interrupted it. As far as I could tell, it had not
> made much progress.

I must ask: did you apply my latest delta patches? Also, did you use
a recent version of git that implements pack data reuse?

Nicolas
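
Carl's size measurements could be reproduced along these lines. This
is only a sketch: rev1.bin and rev2.bin are hypothetical names
standing in for two neighboring 20 MB revisions, and the xdelta 1.x
command syntax is assumed.

    # Compressed size of one full revision (roughly 10% of 20 MB):
    gzip -c rev2.bin | wc -c

    # Delta between two neighboring revisions, then its compressed
    # size. Per the numbers above this is ~2.5 MB raw and ~2 MB
    # gzipped, i.e. barely smaller than the compressed full revision,
    # which is why deltification buys almost nothing for these blobs:
    xdelta delta rev1.bin rev2.bin rev.xd
    gzip -c rev.xd | wc -c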
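
Nicolas's two questions can be checked directly. Again a sketch, not
a recipe: /tmp/test-pack is an illustrative base name, and the
rev-list/pack-objects pipe is just one way to exercise the same code
path that the clone stalled in.

    # Pack data reuse requires a sufficiently recent git:
    git --version

    # Time pack generation by itself, bypassing the network side of
    # the clone; this is the step that chewed for an hour:
    time git rev-list --objects --all | git pack-objects /tmp/test-pack

    # Once the repository itself is fully packed...
    git repack -a -d

    # ...a later clone or fetch can reuse the existing pack data
    # instead of re-deltifying the 20 MB blobs from scratch.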