[Question] Is it normal for accented characters to be shown as decomposed Unicode on GNU/Linux?

Hi everybody,

I have a repository where some file and folder names contain accented
characters because they are in French. Such names include "rêve" (dream),
"réunion" (meeting), etc.

Whether already in version control or not, git tools only show their
*decomposed* representation (I use a UTF-8 locale, see below), but don't
accept those representations as input (and auto-completion is broken for
those), which is a bit misleading (test case follows).

I've seen the threads about accented characters on OS X and the use of
'core.precomposeunicode', but as I'm running GNU/Linux I thought this
shouldn't apply.

Since I've already had a problem in git with a weirdly encoded character
(see http://thread.gmane.org/gmane.comp.version-control.git/269710), I
wanted to get some feedback to determine whether my setup is the cause
or whether it is normal to see decomposed file names in git. I found
this in man git-status:

> If a filename contains whitespace or other nonprintable
> characters, that field will be quoted in the manner of a C string
> literal: surrounded by ASCII double quote (34) characters, and with
> interior special characters backslash-escaped.

So does everybody using accented characters see them in decomposed form
in git? And if so, why don't some programs built on top of it (like
gitit [1]) inherit those decomposed representations?

[1] http://gitit.net/
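
A quick way to check what these escapes stand for is to decode them in
the shell. The sketch below assumes a POSIX `printf` and `od`; the
variable names are only illustrative. It decodes the quoted name from the
test case below and dumps its bytes, next to a name built with an explicit
combining circumflex (U+0302) for comparison.

```shell
# printf interprets \NNN in its format string as one byte given in octal,
# so this rebuilds the name exactly as git printed it.
decoded=$(printf 'r\303\252ve')

# Dump the bytes in hex: 72 c3 aa 76 65.
# c3 aa is the UTF-8 encoding of the single code point U+00EA (ê).
printf '%s' "$decoded" | od -An -tx1

# For comparison, a decomposed name is "e" followed by U+0302,
# whose UTF-8 bytes are cc 82 (octal 314 202): 72 65 cc 82 76 65.
decomposed=$(printf 're\314\202ve')
printf '%s' "$decomposed" | od -An -tx1
```

If the first dump shows `c3 aa`, the escapes are the octal UTF-8 bytes of
one precomposed code point, which would suggest git is quoting the raw
bytes of the name rather than decomposing it.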

Thanks!

---
test case:
$ mkdir accent-test && cd !$
$ git init
$ touch rêve réunion
$ git status
On branch master

Initial commit

Untracked files:
  (use "git add <file>..." to include in what will be committed)

	"r\303\251union"
	"r\303\252ve"
$ git add .
$ git commit -m "accent test"
[master (root commit) 0d776b7] accent test
 2 files changed, 0 insertions(+), 0 deletions(-)
 create mode 100644 "r\303\251union"
 create mode 100644 "r\303\252ve"
$ git log --summary
commit 0d776b7a09d5384a76066999431507018e292efe
Author: Bastien Traverse <bastien@traverse.email>
Date:   2015-06-22 14:13:46 +0200

    accent test

 create mode 100644 "r\303\251union"
 create mode 100644 "r\303\252ve"
$ mv rêve reve
$ git status
On branch master
Changes not staged for commit:
  (use "git add/rm <file>..." to update what will be committed)
  (use "git checkout -- <file>..." to discard changes in working directory)

	deleted:    "r\303\252ve"

Untracked files:
  (use "git add <file>..." to include in what will be committed)

	reve

no changes added to commit (use "git add" and/or "git commit -a")
$ git add [[TAB-TAB]]
"r\303\252ve"  reve
$ git add "[[TAB]] --> git add "\"r\\303\\252ve\""
fatal: pathspec '"r\303\252ve"' did not match any files
$ git add "r\303\252ve"
fatal: pathspec 'r\303\252ve' did not match any files
$ git add rêve reve    # or: git add .
$ git status
On branch master
Changes to be committed:
  (use "git reset HEAD <file>..." to unstage)

	renamed:    "r\303\252ve" -> reve
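
For reference, git's documentation describes a `core.quotepath` setting
that controls this quoting: when it is true (the default), bytes above
0x80 in path names are escaped in octal and the name is wrapped in double
quotes. A minimal sketch of its effect, assuming git is installed and
using a throwaway directory (the `repo`, `quoted` and `plain` names are
only illustrative):

```shell
# Work in a temporary repository so nothing real is touched.
repo=$(mktemp -d)
cd "$repo" && git init -q

# Create a file named "rêve", spelled as explicit UTF-8 bytes.
touch "$(printf 'r\303\252ve')"

# Default behaviour: bytes above 0x80 are escaped and the name is quoted.
quoted=$(git -c core.quotepath=true status --porcelain)

# With core.quotepath=false the name is printed verbatim.
plain=$(git -c core.quotepath=false status --porcelain)

printf '%s\n%s\n' "$quoted" "$plain"
```

To make this persistent for one repository, `git config core.quotepath
false` sets the same option. Whether it also changes the completion
behaviour shown above is not something this sketch checks.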

I'm running an up-to-date Arch Linux with the following software versions
and locale config:

$ uname -a
Linux xxx 4.0.5-1-ARCH #1 SMP PREEMPT Sat Jun 6 18:37:49 CEST 2015 x86_64 GNU/Linux
$ bash --version
GNU bash, version 4.3.39(1)-release (x86_64-unknown-linux-gnu)
$ git --version
git version 2.4.3
$ locale
LANG=fr_FR.utf8
LC_CTYPE="fr_FR.utf8"
LC_NUMERIC=fr_FR.utf8
LC_TIME=fr_FR.utf8
LC_COLLATE="fr_FR.utf8"
LC_MONETARY=fr_FR.utf8
LC_MESSAGES="fr_FR.utf8"
LC_PAPER=fr_FR.utf8
LC_NAME="fr_FR.utf8"
LC_ADDRESS="fr_FR.utf8"
LC_TELEPHONE="fr_FR.utf8"
LC_MEASUREMENT=fr_FR.utf8
LC_IDENTIFICATION="fr_FR.utf8"
LC_ALL=
$ localectl
   System Locale: LANG=fr_FR.UTF8
       VC Keymap: fr
      X11 Layout: fr
     X11 Variant: oss

Cheers