Johannes Sixt <j.sixt@xxxxxxxxxxxxx> writes:

> Is there already an established definition which the "correct"
> .gitattributes are? IIRC, everywhere else we are looking at the
> .gitattributes in the worktree, regardless of whether the object at the
> path in question is in the worktree, the index, or in an old commit.

No, and it is deliberately kept vague while waiting for us to come up
with a clear definition of what is "correct".

We could declare, from a purist's point of view, that the attributes
should be taken from the same place as the path in question is taken
from. When running "git add foo.c", we grab the contents of "foo.c"
from the working tree, so ".gitattributes" from the working tree should
be applied when dealing with "foo.c". Similarly, the contents of blob
"foo.c" that "git checkout foo.c" reads from the index would get
attributes from ".gitattributes" in the index (to find what its
smudging semantics are) before it gets written out to the working tree.
"git diff A B" would apply the attributes from tree A to the preimage
side and the attributes from tree B to the postimage side.

But the last example has some practical issues. Very often, people
retroactively define attributes to correct earlier mistakes. If an
older tree A forgot to declare that a path mybank.gnucash is a GnuCash
ledger file, while a newer tree B (and the current checkout that is
even newer) does [*1*], it is more useful in practice to apply the
newer definition from .gitattributes to both trees (and you are much
less likely to have a checkout of an ancient tree while running "git
diff A B" to compare two trees that are newer than the current
checkout). Using the file from the working tree is the best
approximation of "we want to use the newer one", both from the
semantics (i.e. you are likely to have a fresher tree checked out) and
the performance (i.e. reading from files in the working tree is far
cheaper than reading from historical trees) point of view.

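To make the two candidate definitions concrete, here is a minimal
sketch of them as a lookup. It is only a model of the discussion above,
not how Git is implemented; the names CONTENT_SOURCES and
attribute_sources, and the policy labels "purist" and "worktree", are
made up for illustration.

```python
# Where each command reads the *contents* from, per the examples above.
# (Hypothetical table for illustration; not a Git data structure.)
CONTENT_SOURCES = {
    "git add foo.c": {"foo.c": "working tree"},
    "git checkout foo.c": {"foo.c": "index"},
    "git diff A B": {"preimage": "tree A", "postimage": "tree B"},
}


def attribute_sources(command, policy):
    """Return where .gitattributes would be read from for each side."""
    content = CONTENT_SOURCES[command]
    if policy == "purist":
        # Attributes come from the same place as the contents.
        return dict(content)
    if policy == "worktree":
        # Attributes always come from the working tree.
        return {side: "working tree" for side in content}
    raise ValueError("unknown policy: %s" % policy)


for policy in ("purist", "worktree"):
    print(policy, attribute_sources("git diff A B", policy))
```

Under the "purist" policy the two sides of "git diff A B" can end up
with different attributes; under the "worktree" policy both sides
always share one definition.
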
So it is not so cut-and-dried that "take the attributes from the same
place" is a good and "correct" definition [*2*].


[Footnote]

*1* GnuCash writes, by default, a gzip compressed xml file, so I have in
my .gitattributes file

    *.gnucash	filter=gnucash

and then in my .git/config

    [filter "gnucash"]
        clean = gzip -dc
        smudge = gzip -c

This allows "git diff" to work reasonably well (if you do not mind
reading diff between two versions of xml files, that is) and also helps
delta compression when packing the repository.

*2* Besides, the attributes are primarily used to define the semantics
of the contents in question. If a path is of "gnucash" kind (i.e. has
the "filter=gnucash" attribute in the previous example) in one tree,
and the same path is of a different kind in the other tree (e.g.
"filter=ooo" that says "this is an Ooo file"), it is very likely that
it does not even make sense, with or without content filtering, to
compare them with "git diff". So "take the attributes from the same
place" would have to imply "if the attributes do not match, say
something similar to 'Binary files differ'", which is just as useless
as applying one attribute taken from a convenient but random place
(i.e. the working tree).
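
For illustration, the effect of the filter pair in footnote *1* can be
modelled in a few lines of Python. This is a sketch under assumptions,
not what Git runs internally: clean() and smudge() are hypothetical
stand-ins for "gzip -dc" and "gzip -c", and the two tiny XML documents
are made-up sample data.

```python
import difflib
import gzip


def clean(worktree_bytes):
    """Stand-in for 'clean = gzip -dc': what ends up stored as the blob."""
    return gzip.decompress(worktree_bytes)


def smudge(blob_bytes):
    """Stand-in for 'smudge = gzip -c': what is written to the working tree."""
    return gzip.compress(blob_bytes)


# Made-up sample data: two versions of a ledger, as plain XML.
old_xml = b"<ledger>\n  <balance>100</balance>\n</ledger>\n"
new_xml = b"<ledger>\n  <balance>250</balance>\n</ledger>\n"

# The files as they would sit in the working tree (gzip compressed).
old_file = smudge(old_xml)
new_file = smudge(new_xml)

# After cleaning, the stored contents are plain text and diff line by line.
old_blob = clean(old_file)
new_blob = clean(new_file)

diff_lines = list(difflib.unified_diff(
    old_blob.decode().splitlines(),
    new_blob.decode().splitlines(),
    fromfile="a/mybank.gnucash",
    tofile="b/mybank.gnucash",
    lineterm="",
))
print("\n".join(diff_lines))
```

Because the stored form is the uncompressed XML, a one-line change in
the ledger shows up as a one-line change in the diff, which is also
what makes delta compression effective.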