Andy Parkins <andyparkins@xxxxxxxxx> writes:

> No parsing of the keyword itself is performed, the content is simply
> dropped.

You are sidestepping the most important problem by doing this.

The only sensible keyword you could have, without destroying
what git is, is blob id.  No commit id, no date, no author.

In http://article.gmane.org/gmane.comp.version-control.git/44654,
Linus said:

    I'll finish off trying to explain the problem in fundamental
    git terms: say you have a repository with two branches, A and
    B, and different history on a file "xyzzy" in those two
    branches, but because they both ended up applying the same
    patches, the actual file contents do end up being 100%
    identical. So they have the same SHA1.

    What is

        git diff A..B -- xyzzy

    supposed to print?

    And *I* claim that if you don't get an immediate and empty
    diff, your system is TOTALLY BROKEN.

Another thing he could have said is this:

    When you have such two branches, A and B, and you are on
    branch A:

        $ git checkout B

    should be immediate and instantaneous.

If you try to keyword expand commit id, date or anything that is
sensitive to *how* you got there, even though A and B have the
exact same set of blobs, you have to essentially update all of
them.  Computing what to expand to takes (perhaps prohibitively
expensive) time, but more importantly rewriting the whole 20k
(or howmanyever you have in your project) files out becomes
necessary, if your keyword expansion wants to say "oh, this file
was taken from a checkout of branch B", for obvious reasons.

Keyword expanding blob-id, or munging line-endings to CRLF form
on platforms that want it, do not have this problem, as how you
reached the blob content does not affect the result of
expansion, therefore not just the blobs in commit A and commit B
but the working tree checked out of them must match with each
other.

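
To make the distinction concrete, here is a rough Python sketch; it
is not git code.  The only fact it relies on is that a blob's name is
the SHA-1 of "blob <size>\0" followed by the content.  The "$Id$" and
"$Commit$" markers, both expand_* helpers and the two commit names
are hypothetical, made up for illustration only.

```python
import hashlib


def blob_id(content: bytes) -> str:
    """Name of a git blob: SHA-1 over 'blob <size>\\0' plus the content."""
    header = b"blob %d\x00" % len(content)
    return hashlib.sha1(header + content).hexdigest()


def expand_blob_keyword(content: bytes) -> bytes:
    """Hypothetical expansion that depends on the blob content alone."""
    return content.replace(b"$Id$", b"$Id: " + blob_id(content).encode() + b" $")


def expand_commit_keyword(content: bytes, commit: str) -> bytes:
    """Hypothetical expansion that depends on how we got to the blob."""
    return content.replace(b"$Commit$", b"$Commit: " + commit.encode() + b" $")


# The same blob, reached through two different (made-up) commits.
xyzzy = b"/* $Id$ $Commit$ */\nint xyzzy;\n"
commit_a = "a" * 40  # placeholder name for the tip of branch A
commit_b = "b" * 40  # placeholder name for the tip of branch B

# Blob-id expansion: one result per blob, so switching from A to B
# leaves the working tree file untouched.
same = expand_blob_keyword(xyzzy) == expand_blob_keyword(xyzzy)

# Commit-id expansion: the file has to be rewritten on every switch,
# even though the blob did not change.
differs = (expand_commit_keyword(xyzzy, commit_a)
           != expand_commit_keyword(xyzzy, commit_b))
```

With the second kind of expansion every one of the 20k files is in
the position of xyzzy above, which is where the cost comes from.
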
Having reiterated what Linus already said about why keyword
expansion and git are not friendly with each other (perhaps the
reason is that the former is stupid and git is smart), I'd
try to be a bit constructive and point out the areas you _could_
help with in the nearby codepaths:

 * When 'diff' borrows from the working tree because the
   filesystem data matches the blob we are interested in, we
   already have a call to convert_to_git().  The diff machinery
   operates on the canonicalized representation (i.e. this is an
   area we do not need help from you).

 * When 'checkout', 'read-tree -u' and 'merge-recursive' write
   things, we already have calls to convert_to_working_tree() to
   munge blob representation to working tree representation
   (i.e. again, this is an area we do not need help from you).

 * We do not do the borrowing from working tree when doing
   grep_sha1(), but when we grep inside a file from working tree
   with grep_file(), we do not currently make it go through
   convert_to_git() to fix line endings.  Maybe we should, if
   only for consistency.

 * We do not currently run convert_to_git() on the patch text
   given to git-apply; we could do so in parse_single_patch().
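
For the grep_file() point, a toy model in Python may help show why
consistency matters.  This is a sketch under loud assumptions, not
git's C code: the two convert_* functions below only borrow the
names and do nothing but LF/CRLF munging, and grep_lines() is a
made-up stand-in for the grep machinery.

```python
import re


def convert_to_git(worktree: bytes) -> bytes:
    """Toy model: canonicalize working tree data by turning CRLF into LF."""
    return worktree.replace(b"\r\n", b"\n")


def convert_to_working_tree(blob: bytes) -> bytes:
    """Toy model: turn LF into CRLF for a platform that wants it."""
    return convert_to_git(blob).replace(b"\n", b"\r\n")


def grep_lines(data: bytes, pattern: bytes) -> list:
    """Hypothetical stand-in for grep: return the lines matching pattern."""
    rx = re.compile(pattern)
    return [line for line in data.split(b"\n") if rx.search(line)]


blob = b"start\nthe end\n"
worktree = convert_to_working_tree(blob)

# The conversion round-trips, so the canonical form is always recoverable.
roundtrip_ok = convert_to_git(worktree) == blob

# An end-of-line anchor hits in the blob, misses in the raw working
# tree file (each line now ends in CR), and hits again once the data
# has gone through convert_to_git().
in_blob = grep_lines(blob, rb"end$")
in_raw_worktree = grep_lines(worktree, rb"end$")
in_canonical = grep_lines(convert_to_git(worktree), rb"end$")
```

In this model the same pattern gives different answers depending on
whether it is run against the blob or against the checked-out file,
and that is the inconsistency the third bullet is about.
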