Re: [PATCH 2/2] Add keyword unexpansion support to convert.c

Andy Parkins <andyparkins@xxxxxxxxx> · Tue, 17 Apr 2007 12:35:33 +0100

On Tuesday 2007 April 17 11:09, Junio C Hamano wrote:

> In http://article.gmane.org/gmane.comp.version-control.git/44654,
> Linus said:
>     And *I* claim that if you don't get an immediate and empty diff, your
>     system is TOTALLY BROKEN.

Well that one is easy - the file is normalised to contain collapsed keywords 
upon checkin, so diff works the same as it ever did.  The output would be 
immediate and empty, so the system is not TOTALLY BROKEN.
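
A minimal sketch of that normalisation step, in Python rather than git's C and
not taken from the patch itself.  The helper name collapse_keywords, the
KEYWORDS tuple and the CVS-style "$Name: value $" syntax are all assumptions
made for illustration:

```python
import re

# Hypothetical keyword set; the thread only mentions $BlobID$ by name.
KEYWORDS = (b"BlobID",)

# Matches both "$BlobID$" and "$BlobID: some value $" (value on one line).
_KEYWORD_RE = re.compile(
    rb"\$(" + b"|".join(re.escape(k) for k in KEYWORDS) + rb")(?::[^$\n]*)?\$"
)


def collapse_keywords(data: bytes) -> bytes:
    """Return data with every expanded keyword reduced to its bare form."""
    return _KEYWORD_RE.sub(lambda m: b"$" + m.group(1) + b"$", data)
```

Under these assumptions, a working-tree file that differs from the stored
content only by keyword expansion collapses to the same bytes, so comparing
the two yields an empty diff.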

> 	$ git checkout B
>
> 	should be immediate and instantaneous.

Now - that's a much better argument.  However, it's not relevant: keywords (in 
other VCSs, and so why not in git) are only updated when a file is checked 
out.  There is no need to touch every file.  It's actually beneficial, 
because the keyword in the file reflects the state of the file at the time it 
was checked in - which is actually more useful than updating it to the latest 
commit every time.

That means you're only ever expanding in a file that you're changing anyway - 
so it's effectively free.  git-checkout would still be immediate and 
instantaneous.
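
To make the cost argument concrete, here is a rough sketch (Python, not git
code).  It assumes each commit's tree can be flattened into a hypothetical
path -> blob id mapping; paths_to_rewrite is an illustrative name, not a git
function:

```python
def paths_to_rewrite(old_tree: dict[str, str], new_tree: dict[str, str]) -> list[str]:
    """Paths that must be written out when switching from old_tree to new_tree.

    A path is rewritten only if its blob id differs or it is new; removed
    paths are deleted rather than rewritten, so they are not listed here.
    """
    return sorted(
        path for path, blob in new_tree.items() if old_tree.get(path) != blob
    )
```

If keywords are expanded only for the paths this returns, two commits with
identical blobs give an empty list and no file is touched.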

> If you try to keyword expand commit id, date or anything that is
> sensitive to *how* you got there, even though A and B have the
> exact same set of blobs, you have to essentially update all of
> them.  Computing what to expand to takes (perhaps prohibitively
> expensive) time, but more importantly rewriting the whole 20k
> (or howmanyever you have in your project) files out becomes
> necessary, if your keyword expansion wants to say "oh, this file
> was taken from a checkout of branch B", for obvious reasons.

Leaving aside the fact that expansion happens only when a file is checked out, 
I'd argue that it's your own fault if you enable keyword expansion on twenty 
thousand files.  A lot of the discussion has been about how useless keyword 
expansion is in almost every case.  I only want it for a few files in my 
repository, so I am willing to pay the small computing cost.  Obviously 
keywords would be disabled by default - in which case, you get what you 
deserve if you enable them on everything.

Putting my own selfish requirements aside, from a purely "mine is better than 
yours" point of view, git can't do something that CVS (in all its 
horridness) can.  It's distinctly off-putting when people 
say "keyword expansion" and the response is "YOU'RE AN IDIOT - GO AWAY - 
YOU DON'T DESERVE TO USE GIT"; back they'll scurry to CVS/subversion.

> Keyword expanding blob-id, or munging line-endings to CRLF form
> on platforms that want it, do not have this problem, as how you
> reached to the blob content does not affect the result of
> expansion, therefore not just the blobs in commit A and commit B
> but the working tree checked out of them must match with each
> other.

That's true - however, even if the only keyword git supports is $BlobID$, that 
would address a large proportion of people's needs.  As I said above though, 
the keywords are only expanded on checkout (and checkin to be consistent).
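
As an illustration only (Python, not the patch), a $BlobID$ round trip could
look like the sketch below.  blob_id follows git's blob hashing (SHA-1 over a
"blob <size>\0" header plus the content); collapse, expand and the
"$BlobID: <id> $" expanded form are assumptions, as is the choice to hash the
collapsed content:

```python
import hashlib
import re

# Matches "$BlobID$" and "$BlobID: <anything on one line> $".
_BLOBID_RE = re.compile(rb"\$BlobID(?::[^$\n]*)?\$")


def blob_id(data: bytes) -> str:
    """Object id git assigns to a blob with this content."""
    header = b"blob %d\0" % len(data)
    return hashlib.sha1(header + data).hexdigest()


def collapse(data: bytes) -> bytes:
    """Checkin direction: strip any expanded value (hypothetical helper)."""
    return _BLOBID_RE.sub(lambda m: b"$BlobID$", data)


def expand(data: bytes) -> bytes:
    """Checkout direction: insert the id of the collapsed content."""
    normalised = collapse(data)
    tag = b"$BlobID: " + blob_id(normalised).encode("ascii") + b" $"
    return _BLOBID_RE.sub(lambda m: tag, normalised)
```

Because the id depends only on the collapsed content, expanding twice gives
the same bytes, and collapsing an expanded file returns the stored form no
matter which commit or branch the blob was reached through.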

> Having reiterated what Linus already said why keyword expansion
> and git are not friendly with each other (perhaps the reason is
> because the former is stupid and git is smart), I'd try to be a

(This is where my "YOU'RE AN IDIOT - YOU CAN'T USE GIT" alarm goes off).  Git 
is better than CVS/subversion in every respect - save this one.  It's almost 
completely free to do (apart from the initial coding of it of course) because 
of these two factors:
 - The keywords are collapsed in the repository
 - The keywords are only expanded on checkout
It doesn't fundamentally alter anything that git does right now.

>  * We do not do the borrowing from working tree when doing
>    grep_sha1(), but when we grep inside a file from working tree
>    with grep_file(), we do not currently make it go through
>    convert_to_git() to fix line endings.  Maybe we should, if
>    only for consistency.

I'd actually argue not - git-grep searches the working tree.  The expanded 
keywords are in the working tree.  Take the CRLF case - I'm a clueless user, 
who only understands the system I'm working on.  I want to search for all the 
line endings, so I do git-grep "\r\n" - that should work, because I'm 
searching my working tree.
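
A toy illustration of the difference (Python; crlf_to_lf is a deliberately
simplified stand-in for the line-ending part of convert_to_git(), not its
real behaviour):

```python
def crlf_to_lf(data: bytes) -> bytes:
    """Simplified stand-in for normalising line endings on the way into git."""
    return data.replace(b"\r\n", b"\n")


def count_crlf(data: bytes) -> int:
    """Number of CRLF line endings a byte-level search would find."""
    return data.count(b"\r\n")
```

Searching the working-tree bytes directly finds every CRLF; searching after
normalisation finds none, which is not what the user in this example asked
for.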

>  * We do not currently run convert_to_git() on the patch text
>    given to git-apply; we could do so in parse_single_patch().

Yep - definitely; the patch should certainly be normalised before it is 
applied.  I'd have to add that if I wanted keywords anyway, wouldn't I?

Andy
-- 
Dr Andy Parkins, M Eng (hons), MIET
andyparkins@xxxxxxxxx