On Mon, Sep 24, 2012 at 07:51:20PM +0200, Angelo Borsotti wrote:

> #!/bin/bash
>
> set -v
> cd remote
> rm -rf * .git/
> git init
> echo '*.pdf -crlf -diff merge=binary' >.git/info/attributes
>
> touch f1
> git add f1
> echo "aaa" >f1.pdf
> git add f1.pdf
> cp <very large pdf file, some 100 Mbytes>.pdf f2.pdf
> git add f2.pdf
> git commit -m A
> cd ..
>
> cd local
> rm -rf * .git/
> git init
> echo '*.pdf -crlf -diff merge=binary' >.git/info/attributes
> git remote add remote ../remote
>
> touch f3
> git add f3
> git commit -m B
> git checkout -b develop
>
> echo "bbb" >f2.pdf
> git add f2.pdf
> git commit -m C
> git pull -v --squash remote master
>
> ls
> cat <f2.pdf
>
> set +v
>
> Replace <very large pdf file, some 100 Mbytes>.pdf with the path of a pdf file
> that is really large and run it.
> When it executes the git pull it spends on my computer some 30 seconds,
> obviously transferring the pdf file, that then it disregards because of the
> merge=binary attribute.

It does not disregard the file. The working tree is left with your
existing version of f2.pdf, but note that the index still marks the
conflict. Your next step would be to resolve the conflict in some way.
Towards that end, you can now inspect both sides:

  git show :2:f2.pdf ;# our side
  git show :3:f2.pdf ;# their side

Or you can invoke a mergetool to start a third-party merge helper on the
binary files:

  git mergetool

Or you can just resolve in favor of "their" side:

  git checkout --theirs f2.pdf

From your description, I imagine your intent is to simply resolve in
favor of "ours", and never look at the other side. However, git does
not have enough information to know that. There is no "merge=ours"
attribute (and indeed, it would be kind of crazy, since your result would
depend on which direction you were merging, which is something you only
know at the time of merge. Hence it makes sense as a command-line option
for a strategy, but not as an attribute of a file).

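
As a minimal sketch of that resolution flow, something like the following
leaves a conflict in the index, shows both sides, and then resolves in
favor of "ours". This is not your exact setup: it assumes a throwaway
repository with a shared base commit, a tiny stand-in for the large PDF,
and a made-up branch name ("theirs") and committer identity.

```shell
#!/bin/sh
# Sketch only: reproduce a merge=binary conflict in a scratch repository,
# inspect both sides, then keep "our" version. All names are illustrative.
set -e

work=$(mktemp -d)
cd "$work"
git init -q repo
cd repo
git config user.email you@example.com   # hypothetical identity
git config user.name "Example User"

# Mark PDFs as binary for merging, as in the original script.
mkdir -p .git/info
echo '*.pdf -diff merge=binary' >.git/info/attributes

# Shared base commit (an assumption; the original histories were unrelated).
echo "base content" >f2.pdf
git add f2.pdf
git commit -q -m base

# "Their" side changes the file on a separate branch.
git checkout -q -b theirs
echo "their content" >f2.pdf
git commit -q -am theirs

# "Our" side changes the same file on the initial branch.
git checkout -q -
echo "our content" >f2.pdf
git commit -q -am ours

# The squash merge conflicts; the working tree keeps our version.
git merge --squash theirs || true

# Both sides are available from the index.
git show :2:f2.pdf   # our side
git show :3:f2.pdf   # their side

# Resolve in favor of "ours" and mark the path as resolved.
git checkout --ours f2.pdf
git add f2.pdf

resolved=$(cat f2.pdf)
echo "resolved to: $resolved"
```

The same decision can also be made up front on the command line with the
"ours" strategy option (-X ours), which is where a per-merge choice like
this belongs.
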
All that being said, we can construct a case where the contents of the
PDF really _don't_ matter at all to the result. Like this:

  # new repo
  git init parent
  cd parent

  # make a commit with a giant file
  echo small >foo.txt
  cp <your-giant-file>.pdf big.pdf
  git add .
  git commit -m one

  # now get rid of the giant file
  git rm big.pdf
  git commit -m two

  # now merge it into another history
  git init ../child
  cd ../child
  echo unrelated >file.txt
  git add .
  git commit -m three
  git pull -v --squash ../parent master

Because we are doing a squash merge, we will throw away most of the
history we fetch, and only ever look at the tip of parent/master (which
in this case does not contain the PDF), and the shared ancestor (which in
this case is empty, since there is no shared history).

So in theory we could get by with fetching all the commits (to do the
history traversal), and the trees and blobs only from the tip commit.
But that is not a good idea in general for two reasons:

  1. Even if that PDF is not used in the actual merge algorithm, the
     contents of the earlier commits are useful for figuring out what
     happened (e.g., when resolving another conflict, you might want to
     refer back via "git log").

  2. It breaks git's reachability assumptions. Git always makes sure
     that if you have object X, you have all of the objects it refers
     to, the ones they refer to, and so on. This assumption underlies
     many of git's operations (e.g., what we need to send to a remote
     who claims to have commit X).

In this case, since you are using --squash, you could presumably throw
away the original history after doing the squash merge. But it would be
quite complex to special-case this in the protocol, and almost certainly
not worth it for this corner case.

-Peff