Re: git diff does not precompose unicode file paths (OS X)

> On 04 Mar 2016, at 13:16, Torsten Bögershausen <tboegi@xxxxxx> wrote:
> 
> On 03/04/2016 10:07 AM, Alexander Rinass wrote:
>> Hello,
>> 
>> It appears that the git diff command does not precompose file path arguments, even if the option core.precomposeunicode is set to true (which is the default on OS X).
>> 
>> Passing the decomposed form of a file path to the git diff command will yield no diff for a modified file.
>> 
>> In my case, the decomposed form of the file path is sent by the OS X Cocoa framework's NSTask, which I am using in an application. It can be simulated on OS X by using $(iconv -f utf-8 -t utf-8-mac <<< FILE_PATH) as the file path argument on the shell.
>> 
>> Git commands like add, log, ls-tree, ls-files, mv, ... accept both file path forms; git diff does not.
>> 
>> It can be tested with the following setup on OS X (as iconv's utf-8-mac encoding is only available on OS X):
>> 
>>     git init .
>>     git config core.quotepath true
>>     git config core.precomposeunicode true # (default on OS X)
>>     touch .gitignore && git add .gitignore && git commit -m "Initial commit"
>>     echo "." >> Ä
>>     git add Ä
>>     git commit -m "Create commit with unicode file path"
>>     echo "." >> Ä
>>
>> This gives the following status, showing the precomposed form of "Ä":
>>
>>     git status --short
>>      M "\303\204"
>>
>> Running git add with both forms does work as expected:
>>
>>     git add Ä
>>     git status --short
>>     M  "\303\204"
>>     git reset HEAD -- Ä
>>     git add $(iconv -f utf-8 -t utf-8-mac <<< Ä)
>>     git status --short
>>     M  "\303\204"
>>     git reset HEAD -- Ä
>>
>> However, running git diff only works with the precomposed form:
>>
>>     git status --short
>>      M "\303\204"
>>     git --no-pager diff -- Ä
>>     [...shows diff...]
>>     git --no-pager diff -- $(iconv -f utf-8 -t utf-8-mac <<< Ä)
>>     [...shows NO diff...]
>> 
>> I took a look at the Git source code, and the builtin/diff*.c files do not contain the parse_options call (which in turn calls precompose_argv) that the other builtins use.
>> 
>> But I am not really familiar with either C or the Git project structure, so this may not mean anything.
>> 
>> Best regards,
>> Alexander Rinass
>> 
> Good analysis, and thanks for the report.
> It should be possible to stick in a
> 
> precompose_argv(argc, argv)
> 
> into builtin/diff.c
> 
> Do you think you can test that?
> 


Sticking a precompose_argv(argc, argv) into diff.c’s cmd_diff function fixes the issue.
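
For reference, here is a small sketch of why the unconverted argument presumably never matches: the two spellings of "Ä" look the same but are different byte sequences. The octal escapes for the precomposed form are the ones from the git status output above; the decomposed bytes ("A" followed by U+0308 COMBINING DIAERESIS) are written out by hand here instead of going through iconv, so this runs on any POSIX shell and not only on OS X. The variable names are just for illustration.

```shell
#!/bin/sh
# Precomposed (NFC) "Ä": U+00C4, UTF-8 bytes C3 84 (octal \303\204).
nfc=$(printf '\303\204')
# Decomposed (NFD) "Ä": "A" + U+0308 COMBINING DIAERESIS, UTF-8 bytes 41 CC 88.
nfd=$(printf 'A\314\210')

# Plain string comparison: the two forms are not equal.
if [ "$nfc" = "$nfd" ]; then
    echo "same bytes"
else
    echo "different bytes"
fi

# Count the bytes of each form (tr strips the padding some wc versions add).
nfc_len=$(printf '%s' "$nfc" | wc -c | tr -d ' ')
nfd_len=$(printf '%s' "$nfd" | wc -c | tr -d ' ')
echo "precomposed: $nfc_len bytes, decomposed: $nfd_len bytes"
```

So unless something converts the argument first, a path given in decomposed form cannot compare equal to the precomposed path that git status reports.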

But I had to disable the check (precomposed_unicode != 1) in precompose_argv to make it work. Presumably precompose_argv is normally reached through parse_options, and some other call that has to happen before it is missing at the point where I put it?

I think it is clear that diff.c and friends are missing the precomposing step. I am not sure about the right way to fix it, though (should parse_options be used in the end?), and my C skills are basic at best; otherwise I would create a patch.
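
In the meantime, a caller-side workaround might look like the sketch below: convert the path to precomposed form before handing it to git diff. The function name precompose_path is made up for this example. It assumes the utf-8-mac encoding of OS X's iconv (the same one used in the test setup, only in the opposite direction); on systems where iconv does not know that encoding, it falls back to passing the path through unchanged.

```shell
#!/bin/sh
# precompose_path PATH  (hypothetical helper, not part of Git)
# Prints PATH converted from decomposed to precomposed UTF-8.
# Assumption: iconv knows "utf-8-mac", which is only the case on OS X.
precompose_path() {
    # If iconv is missing or rejects the encoding, keep the path as given.
    converted=$(printf '%s' "$1" | iconv -f utf-8-mac -t utf-8 2>/dev/null) || converted=$1
    printf '%s' "$converted"
}
```

With that in place, the failing call from the test setup would become something like:

    git --no-pager diff -- "$(precompose_path "$FILE_PATH")"

This only papers over the problem for one caller; the real fix still belongs in diff.c.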
