hi all,
there's a major feature for working with large binaries that has not
yet been addressed by git: the ability to check out a file as a
symbolic/hard link to a blob in the repository, instead of duplicating
the file into the working copy.
imagine a scenario where one user is putting large binary files into a
git repo on a networked server. 100 other users on the server need
read-only access to this repo. they clone the repo using --shared or
--local, which saves disk space for the object files, but each of
these 100 working copies still duplicates all the binary files at the
HEAD revision. it would be roughly 100x more efficient in both disk
space and checkout speed if, in place of these files, symbolic or hard
links were made to the blob files in .git/objects.
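to make the idea concrete, here's a minimal sketch of the proposed
checkout step in python (not git code; the function name and paths are
hypothetical): hard-link the stored file data into the working copy
instead of copying it, so the data exists once on disk.

```python
# sketch of a link-based checkout: the object store and the working
# copy share one inode, so no data is duplicated and "checkout" is
# just a link() call. names/paths here are illustrative only.
import os
import tempfile

def checkout_as_link(blob_path: str, worktree_path: str) -> None:
    # hard-link instead of copying; requires both paths to live on
    # the same filesystem
    os.link(blob_path, worktree_path)

with tempfile.TemporaryDirectory() as d:
    blob = os.path.join(d, "stored_blob")
    with open(blob, "wb") as f:
        f.write(b"x" * 1024)  # stand-in for a large binary file
    wc = os.path.join(d, "worktree_file")
    checkout_as_link(blob, wc)
    # both names now point at the same inode
    same_inode = os.stat(blob).st_ino == os.stat(wc).st_ino
    link_count = os.stat(wc).st_nlink

assert same_inode
assert link_count == 2
```

note that with hard links, a write through the working-copy name would
modify the stored object too, which is why this only fits the
read-only-users scenario above.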
the crux of the issue is that the blob objects would have to be stored
as exact copies of the original files. two things currently prevent
this: 1) blobs are stored zlib-compressed, and 2) each one has a small
header ("blob <size>\0") prepended to the file data. compression can
be disabled by setting core.loosecompression to 0, so that seems like
less of an issue. as for the header, couldn't it be stored separately?
in other words, store two files per blob: a small stub file with the
header info, and the unaltered file data.
what are the caveats to a system like this? has anyone looked into
this before?
-chad
p.s.
i tried submitting a post through nabble a few days ago and it said
that it was still pending, so i thought i'd try submitting directly to
the mailing list. sorry if i end up double-posting
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html