On Thu, Mar 10, 2011 at 10:02:53PM +0100, Alexander Miseler wrote:

> I've been debating whether to resurrect this thread, but since it has
> been referenced by the SoC2011Ideas wiki article I will just go ahead.
> I've spent a few hours trying to make this work to make git with big
> files usable under Windows.
>
> > Just a quick aside. Since (a2b665d, 2011-01-05) you can provide
> > the filename as an argument to the filter script:
> >
> >   git config --global filter.huge.clean huge-clean %f
> >
> > then use it in place:
> >
> >   $ cat >huge-clean
> >   #!/bin/sh
> >   f="$1"
> >   echo orig file is "$f" >&2
> >   sha1=`sha1sum "$f" | cut -d' ' -f1`
> >   cp "$f" /tmp/big_storage/$sha1
> >   rm -f "$f"
> >   echo $sha1
> >
> > -- Pete

After thinking about this strategy more (the "convert big binary files
into a hash via clean/smudge filter" strategy), it feels like a hack.
That is, I don't see any reason that git can't give you the equivalent
behavior without having to resort to bolted-on scripts.

For example, with this strategy you are giving up meaningful diffs in
favor of just showing a diff of the hashes. But git _already_ can do
this for binary diffs. The problem is that git unnecessarily uses a
bunch of memory to come up with that answer because of assumptions in
the diff code. So we should be fixing those assumptions. Any place that
this smudge/clean filter solution could avoid looking at the blobs, we
should be able to do the same inside git.

Of course that leaves the storage question; Scott's git-media script
has pluggable storage that is backed by http, s3, or whatever. But
again, that is a feature that might be worth putting into git (even if
it is just a pluggable script at the object-db level).

-Peff
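
A rough sketch of the matching smudge side, for completeness (the
huge-smudge name and the *.bin attribute pattern are illustrative and
not from the thread; /tmp/big_storage is the store used in the quoted
clean script): the smudge filter receives the stored hash on stdin and
must write the real file content to stdout.

  $ cat >huge-smudge
  #!/bin/sh
  # stdin holds what the clean filter stored: the sha1 of the real content
  read sha1
  # emit the original file content from the side store
  cat /tmp/big_storage/$sha1

  $ git config --global filter.huge.smudge huge-smudge
  $ echo '*.bin filter=huge' >>.gitattributes

With both halves configured, checkout restores the real content from
the side store while the repository itself only ever tracks the short
hash line.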