Hi,

I happened to read an older post by Jeff King about "multiblobs" (http://kerneltrap.org/mailarchive/git/2008/4/6/1360014) and I was wondering whether the idea has been abandoned for some reason or just put on hold.

Apparently, this would help a great deal with:

- storing large binary blobs (the split could happen with a rolling checksum approach)
- storing "structured files", such as the many zip-based file formats (OpenDocument, Docx, Jar files, zip files themselves), tars (including compressed tars), pdfs, etc., whose number is rising day after day
- storing binary files with textual tags, where the tags could go in a separate blob, greatly simplifying reading them without any need to cache them in a notes tree
- etc.

Furthermore, this could also:

- help the management of upstream trees. This could be simplified, since the "pristine tree" distributed as a tar.gz file and the exploded repo could share their blobs, making commands such as pristine-tar unnecessary.
- help projects such as bup, which currently need to provide split mechanisms of their own.
- be used to add "different representations" to objects. For instance, when storing a pdf one could use a fake split to store the corresponding text in a separate blob, making the git-diff of pdfs almost instantaneous.

From Jeff's post, I guess that the major issue could be that the same file could get a different sha1 as a multiblob versus a regular blob. Maybe it would be possible to make the multiblob take the sha1 of the "equivalent plain blob" rather than its real hash.

For the moment, I am just very curious about the idea and its possible pros and cons. Can someone (maybe Jeff himself) tell me a little more? I also wonder about the two possibilities: implementing it in git versus implementing it "on top of" git.
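As a rough illustration of the rolling-checksum split and of the "same sha1 as the equivalent plain blob" idea, here is a minimal Python sketch. It is not taken from Jeff's post or from git or bup: the names split_chunks and git_blob_id, the constants, and the plain byte-sum checksum are all assumptions chosen for brevity (real tools use stronger rolling checksums). The one thing borrowed from git is the blob hashing rule, sha1 over "blob <size>\0" followed by the content.

```python
import hashlib
from typing import Iterable, List

# All three constants are hypothetical values picked for illustration.
WINDOW = 16          # number of trailing bytes covered by the rolling sum
MASK = (1 << 6) - 1  # cut when the low 6 bits of the rolling sum are zero
MIN_CHUNK = 32       # never emit a chunk shorter than this (except the last)


def split_chunks(data: bytes) -> List[bytes]:
    """Split data at content-defined boundaries using a rolling byte sum.

    The sum covers the last WINDOW bytes of the input regardless of where
    the current chunk started, so a boundary depends only on nearby content.
    """
    chunks = []
    start = 0
    rolling = 0
    for i, byte in enumerate(data):
        rolling += byte
        if i >= WINDOW:
            rolling -= data[i - WINDOW]  # drop the byte leaving the window
        long_enough = (i + 1 - start) >= MIN_CHUNK
        if long_enough and (rolling & MASK) == 0:
            chunks.append(data[start:i + 1])
            start = i + 1
    if start < len(data):
        chunks.append(data[start:])  # trailing remainder
    return chunks


def git_blob_id(chunks: Iterable[bytes]) -> str:
    """Return the id git would give the plain blob formed by joining chunks."""
    chunks = list(chunks)
    total = sum(len(c) for c in chunks)
    h = hashlib.sha1()
    h.update(b"blob %d\0" % total)  # git's blob header: type, size, NUL
    for c in chunks:
        h.update(c)                 # stream the pieces, no need to join them
    return h.hexdigest()
```

Under these assumptions, git_blob_id(split_chunks(data)) is the id that `git hash-object` reports for the unsplit file, so a split object could in principle be addressed by the sha1 of its equivalent plain blob while its pieces are stored separately.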
Sergio