Re: [PATCH][RFC] git on Mac OS and precomposed unicode

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Torsten Bögershausen <tboegi@xxxxxx> writes:

> On 08.01.12 03:46, Junio C Hamano wrote:
> ...
>> That also sounds sensible, but...
>> 
>>> This is done in git.c by calling argv_precompose() for all commands
>>> except "git commit".
>> 
>> ... I think it generally is a bad idea to say "all except foo". There may
>> be a reason why "foo" happens to be special in today's code, but who says
>> there won't be another command "bar" that shares the same reason with
>> "foo" to be treated specially? Or depending on the options, perhaps some
>> codepath of "foo" may not want the special casing and want to go through
>> the argv_precompose(), no?
>> 
>> After all, "git commit -- pathspec" will have to get the pathspec from the
>> command line,...
>
> Thanks Junio for catching this.
> I added a new test case as well as fixed the code.

I think you are sidestepping the real issue I raised, which is:

    What is the reason why you do not want to feed the precompose helper
    with some arguments to 'git commit', while it is OK to pass all
    arguments to other commands through precomposition?

I admit it was my fault that I did not spell it out clearly in my
response.

I understand that arguments other than pathspec and revs could be left in
decomposed form, but is there any harm in canonicalizing any and all
command line parameters given in decomposed form consistently into
precomposed form? What problem are you trying to solve by special casing
"git commit"? That is the real question to be answered, as there may be
other commands some of whose arguments may not want to be canonicalized
due to the same reason, but you simply overlooked them. When other people
need to fix that oversight, they need a clearly written criterion what
kind of arguments should not be fixed and why.

And the reason cannot be a desire to pass the value to "--message"
argument intact [*1*]; it is not like osx cannot handle text in
precomposed form, right?

In general, I do not want to see ugly code that says "this one potentially
names a path so we add a call to fix it from decomposed form, but that
other one is not a path and we take it intact" sprinkled all over in the
codebase, without a good reason.

It may seem that one alternative to munging argv[] is to have the
precomposition [*2*] applied inside get_pathspec() and have it take effect
only on the pathspecs, which after all ought to be the only place where
this matters, but I doubt it would result in maintainable code. The names
of branches and tags taken from the command line that are used as revision
names will also be compared with results from readdir in $GIT_DIR/refs/
and need to be canonicalized, for example. So I tend to agree with your
"brute force" approach to canonicalize argv[] before any real part of git
sees them; that is where my suggestion to wrap main() to do so came from.

Also some commands (e.g. "rev-list --stdin") take pathspecs and revs from
their standard input stream, so you would need to be careful about them.


[Footnotes]

*1* Also as other commands like "git merge" also take textual message, and
you do pass the helper to canonicalize it.  No, I am not suggesting you to
special case "git merge".

*2* By the way, this may need a better name if the patch touches anywhere
outside compat/osx --- it is about "canonicalize pathname and pathspec
given from the command line into the form used internally by git", and
from an osx person's point of view, the only difference might be
decomposed vs precomposed, but on other odd systems it might be that
pathnames on the filesystem may be using a different encoding from what is
used for pathnames in the index).
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]