Re: [bug] git-check-ignore and file names with unicode chars in name - sys-out filename is corrupted


 



On Tue, Aug 09, 2016 at 01:47:18AM -0400, Paul Hammant wrote:

> Reproduction:
> 
>   $ echo "*.ignoreme" >> .gitignore
>   # (and commit)
>   $ touch "fooo-€.ignoreme"
>   $ find . -print | grep fooo | xargs git check-ignore
>   "./fooo-\342\202\254.ignoreme"
> 
> You could view that git-check-ignore isn't corrupting anything, it is
> just outputting another form for the file name (octal escaped), but it
> doesn't need to change it at all, and it's causing downstream problems
> in bash scripting.

It's not corrupted; like all git commands, check-ignore by default
prints paths with a reversible quoting mechanism, so that odd filenames
are not syntactically ambiguous (e.g., consider a filename with a
newline in it), and so that you don't get binary spew on your terminal.

For robust scripting, you can either:

  - unquote the filenames in the receiving script (detect the presence
    of quoting by the double-quote in the first character, and then
    apply normal C-style dequoting).

or

  - use "-z" to get NUL-delimited filenames with no quoting. Your
    example above has problems in the find, grep, and xargs
    commands, too, since they all treat newlines (and, for xargs,
    whitespace and quotes) as delimiters. A more careful version is:

      find . -print0 | grep -z fooo | git check-ignore --stdin -z
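
If the receiving script takes the first route, the dequoting step could
look roughly like the sketch below. It is written in Python only for
illustration; unquote_git_path is a hypothetical helper, not something
git ships, and it assumes the escapes are the usual C ones (\a \b \f \n
\r \t \v \" \\) plus three-digit octal, as in the output quoted above.

```python
# Hypothetical helper (not part of git): undo C-style path quoting.
# Assumption: escapes are \a \b \f \n \r \t \v \" \\ or three octal digits.
_SIMPLE_ESCAPES = {
    ord("a"): 0x07, ord("b"): 0x08, ord("f"): 0x0C, ord("n"): 0x0A,
    ord("r"): 0x0D, ord("t"): 0x09, ord("v"): 0x0B,
    ord('"'): ord('"'), ord("\\"): ord("\\"),
}


def unquote_git_path(line: bytes) -> bytes:
    """Return the raw path bytes for one line of path output."""
    # Quoting is signalled by a double quote in the first character;
    # anything else is passed through unchanged.
    if not line.startswith(b'"'):
        return line
    if len(line) < 2 or not line.endswith(b'"'):
        raise ValueError("unterminated quoted path")
    body = line[1:-1]
    out = bytearray()
    i = 0
    while i < len(body):
        c = body[i]
        if c != ord("\\"):
            out.append(c)
            i += 1
            continue
        if i + 1 >= len(body):
            raise ValueError("dangling backslash")
        nxt = body[i + 1]
        if nxt in _SIMPLE_ESCAPES:
            out.append(_SIMPLE_ESCAPES[nxt])
            i += 2
            continue
        digits = body[i + 1:i + 4]
        if len(digits) == 3 and all(ord("0") <= d <= ord("7") for d in digits):
            value = int(digits, 8)
            if value > 0xFF:
                raise ValueError("octal escape out of range")
            out.append(value)
            i += 4
            continue
        raise ValueError("unknown escape sequence")
    return bytes(out)
```

The function works on bytes rather than str because a path is not
guaranteed to be valid UTF-8; decode the result only if you know the
encoding of your filenames.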

For human readability, you can do:

  git config core.quotepath false

to avoid quoting bytes with the high bit set (here and in other tools
like "git diff"), which is convenient if you use UTF-8 filenames. It
also will "unbreak" your scripts in the sense that it will avoid quoting
in more situations. The scripts would still choke on weirder filenames
(e.g., ones with embedded tabs or newlines), but in practice you'd
probably never notice.
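
To make the difference concrete, here is a simplified model of which
filenames end up quoted under each setting. It is a sketch based on my
reading of the quoting rules, not git's actual code, and the function
name is_quoted_by_git is made up for this example. The assumption is
that control characters, double quotes, and backslashes are always
quoted, while bytes of 0x80 and above are quoted only when
core.quotepath is true (the default).

```python
def is_quoted_by_git(path: bytes, quotepath: bool = True) -> bool:
    """Simplified model: would this path be printed in quoted form?"""
    for byte in path:
        # Control characters, DEL, double quote and backslash are
        # assumed to trigger quoting regardless of core.quotepath.
        if byte < 0x20 or byte == 0x7F or byte in (ord('"'), ord("\\")):
            return True
        # High-bit bytes (e.g. UTF-8 sequences) are assumed to trigger
        # quoting only when core.quotepath is true.
        if quotepath and byte >= 0x80:
            return True
    return False


euro = "fooo-\u20ac.ignoreme".encode("utf-8")
print(is_quoted_by_git(euro, quotepath=True))
print(is_quoted_by_git(euro, quotepath=False))
print(is_quoted_by_git(b"tab\there", quotepath=False))
```

Under this model the UTF-8 name stops being quoted once core.quotepath
is false, while the name with an embedded tab is quoted either way,
which is why "-z" remains the safer choice for scripts.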

-Peff


