Re: git log and utf-8 in filenames

Joey Hess wrote:
> And did earlier versions of git (circa 2006) perhaps
> not do that escaping? I have code in ikiwiki that apparently used to work, but
> is certainly not working with current git, due to this escaping.

No, I guess it's always done that; perhaps something broke on my side
in the meantime.

But it doesn't seem right somehow that gitweb, ikiwiki, and seemingly
any other program that needs to look at git log / commits and figure out
which filename is being changed each need to include their own nasty code[1]
to convert the escaped characters back to normal characters.

And it seems that anyone who uses a lot of utf-8 in filenames would shortly
get tired of git commit, git log, etc. displaying obfuscated versions of their
filenames.

I'm sure it makes sense to use this format internally in git to represent
filenames, to avoid needing to worry about encoding issues. But it's a shame
that that internal detail is exposed so that everything around git has to
worry about it.

Would making git-log and git-commit display de-escaped filenames be likely
to break something?

-- 
see shy jo

[1] Such as this from gitweb:

# git may return quoted and escaped filenames
sub unquote {
        my $str = shift;

        sub unq {
                my $seq = shift;
                my %es = ( # character escape codes, aka escape sequences
                        't' => "\t",   # tab            (HT, TAB)
                        'n' => "\n",   # newline        (NL)
                        'r' => "\r",   # return         (CR)
                        'f' => "\f",   # form feed      (FF)
                        'b' => "\b",   # backspace      (BS)
                        'a' => "\a",   # alarm (bell)   (BEL)
                        'e' => "\e",   # escape         (ESC)
                        'v' => "\013", # vertical tab   (VT)
                );

                if ($seq =~ m/^[0-7]{1,3}$/) {
                        # octal char sequence
                        return chr(oct($seq));
                } elsif (exists $es{$seq}) {
                        # C escape sequence, aka character escape code
                        return $es{$seq};
                }
                # quoted ordinary character
                return $seq;
        }

        if ($str =~ m/^"(.*)"$/) {
                # needs unquoting
                $str = $1;
                $str =~ s/\\([^0-7]|[0-7]{1,3})/unq($1)/eg;
        }
        return $str;
}
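
For comparison, here is a minimal sketch of the same unquoting in Python. It
is not taken from gitweb or git; the name unquote_git_path is hypothetical,
and the sketch assumes the filenames are UTF-8. It works on bytes because each
octal escape stands for a single byte, so a multi-byte UTF-8 character only
becomes readable once all of its bytes have been put back together and decoded.

```python
import re

# Single-character escapes, mirroring the table in the gitweb snippet above.
_ESCAPES = {
    b"t": b"\t", b"n": b"\n", b"r": b"\r", b"f": b"\f",
    b"b": b"\b", b"a": b"\a", b"e": b"\x1b", b"v": b"\v",
}

# A backslash followed by 1-3 octal digits, or by any other single byte.
_ESCAPE_RE = re.compile(rb"\\(?:([0-7]{1,3})|(.))", re.DOTALL)


def _unescape(match):
    octal, char = match.groups()
    if octal is not None:
        # Octal escape: one byte (masked to 8 bits, an assumption of this sketch).
        return bytes([int(octal, 8) & 0xFF])
    # Known C escape, or a quoted ordinary character such as \" or \\.
    return _ESCAPES.get(char, char)


def unquote_git_path(raw: bytes) -> str:
    """Undo C-style quoting on a path; hypothetical helper, assumes UTF-8."""
    if len(raw) >= 2 and raw.startswith(b'"') and raw.endswith(b'"'):
        raw = _ESCAPE_RE.sub(_unescape, raw[1:-1])
    # Assumption: the path bytes are UTF-8; invalid bytes are replaced.
    return raw.decode("utf-8", errors="replace")
```

With this sketch, unquote_git_path(b'"caf\\303\\251.txt"') returns 'café.txt',
and an unquoted name such as b"plain.txt" is returned unchanged.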
