Re: [PATCH] rebase -p: avoid grep on potentially non-ASCII data

On 09.03.16 21:26, Junio C Hamano wrote:
> Anders Kaseorg <andersk@xxxxxxx> writes:
[]
>  sane_grep () {
> -	GREP_OPTIONS= LC_ALL=C grep "$@"
> +	GREP_OPTIONS= LC_ALL=C grep @@SANE_TEXT_GREP@@ "$@"
>  }
>  
>  sane_egrep () {
> -	GREP_OPTIONS= LC_ALL=C egrep "$@"
> +	GREP_OPTIONS= LC_ALL=C egrep @@SANE_TEXT_GREP@@ "$@"
>  }
>

I always wondered why we do LC_ALL=C.
Isn't that begging for trouble, when we feed UTF-8, ISO-8859-1
or other stuff into a program and say LC_ALL=C at the same time?
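
To make the question concrete, here is a minimal sketch of the combination
being discussed. The helper name text_grep is made up for illustration, and
I am assuming that @@SANE_TEXT_GREP@@ expands to -a and that the grep in use
supports -a (GNU and BSD grep do). It is not the actual sane_grep.

```shell
# Hypothetical helper, not git's sane_grep: force the C locale and
# pass -a so that grep always treats its input as text.
text_grep () {
	GREP_OPTIONS= LC_ALL=C grep -a "$@"
}

# UTF-8 encoded input: the bytes \303\251 encode a non-ASCII character.
# The ASCII pattern is still matched byte by byte under LC_ALL=C.
printf 'caf\303\251\n' | text_grep caf
```

With -a the line is matched even though the C locale has no idea what the
two high bytes mean. Without -a the result depends on the grep version.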

On my Debian Linux system I have
LANG=en_US.UTF-8

and

$ locale -a
C
C.UTF-8
en_US.utf8
POSIX
--------------

Mac OS has LANG unset, and reports
$ locale -a
en_US
en_US.ISO8859-1
en_US.ISO8859-15
en_US.US-ASCII
en_US.UTF-8
# (and a lot more)
C
POSIX

-----
My CentOS has
LANG=en_US.UTF-8

and reports, e.g.,
en_US
en_US.iso88591
en_US.iso885915
en_US.utf8
(and many more)

In t0204 we have
    LANGUAGE=is LC_ALL="$is_IS_locale" git init repo >actual &&
which is based on
	# is_IS.UTF-8 on Solaris and FreeBSD, is_IS.utf8 on Debian
	is_IS_locale=$(locale -a 2>/dev/null |
in 
lib-gettext.sh

Is there something we can steal here?
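
As a rough sketch of the idea (this is not the lib-gettext.sh code, and the
helper name pick_utf8_locale is hypothetical), one could scan the output of
locale -a for a UTF-8 locale, accepting both the "utf8" and the "UTF-8"
spellings seen in the listings above:

```shell
# Hypothetical helper: read locale names on stdin and print the first
# one that is <lang> followed by a UTF-8 suffix ("utf8" or "UTF-8",
# in any letter case). Prints nothing if no such locale is listed.
pick_utf8_locale () {
	LC_ALL=C grep -i -E "^$1\.utf-?8\$" | sed -n 1p
}

# Typical use: feed it the locales installed on this system.
locale -a 2>/dev/null | pick_utf8_locale en_US
```

Reading the list from stdin keeps the helper independent of the locale
command itself, so it can be tried against the listings from other systems.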


http://pubs.opengroup.org/onlinepubs/007908799/xbd/envvar.html





