Re: [PATCH/RFC] Gitweb: Convert UTF-8 encoded file names

On Thu, May 15, 2014 at 7:08 AM, Michael Wagner <accounts@xxxxxxxxxxx> wrote:
> On Thu, May 15, 2014 at 12:25:45AM +0200, Jakub Narębski wrote:
>> On Wed, May 14, 2014 at 11:57 PM, Junio C Hamano <gitster@xxxxxxxxx> wrote:
>>> Michael Wagner <accounts@xxxxxxxxxxx> writes:
>>>
>>>> Perl has an internal encoding used to store text strings. Currently, trying to
>>>> view files with UTF-8 encoded names results in an error (either "404 - Cannot
>>>> find file" [blob_plain] or "XML Parsing Error" [blob]). Converting these UTF-8
>>>> encoded file names into Perl's internal format resolves these errors.
>>
>> Could you give us an example?  What is important is whether filename
>> is passed via path_info or via query string.
>>
>
> There is a file named "Gütekriterien.txt" in my repository. Trying to
> view this file as "blob_plain" produces a 404 error (displaying the
> file name with an additional print statement):
>
> $ REQUEST_METHOD=GET QUERY_STRING='p=notes.git;a=blob_plain;f=work/G%C3%83%C2%BCtekriterien.txt;hb=HEAD' ./gitweb.cgi
>
> work/Gütekriterien.txt
> Status: 404 Not Found

Your URI encoding of "ü" is wrong! In UTF-8, "ü" percent-encodes as
%C3%BC (2 bytes), not as %C3%83%C2%BC (4 bytes, which decode to "Ã¼").

  http://www.url-encode-decode.com/

You tested with the wrong input.
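
To make the difference concrete, here is a minimal sketch in Python
(not part of gitweb, used only to show the byte sequences). It assumes
the 4-byte form came from the UTF-8 bytes of the name being misread as
Latin-1 and then encoded as UTF-8 a second time; that is one plausible
cause, not a confirmed one.

```python
from urllib.parse import quote, unquote

name = "Gütekriterien.txt"

# Correct form: percent-encode the UTF-8 bytes of the name.
# quote() encodes str input as UTF-8 by default.
correct = quote(name)

# Assumed failure mode: the UTF-8 bytes are misread as Latin-1,
# and the resulting text is encoded as UTF-8 once more.
misread = name.encode("utf-8").decode("latin-1")
double = quote(misread)

print(correct)          # G%C3%BCtekriterien.txt
print(double)           # G%C3%83%C2%BCtekriterien.txt
print(unquote(double))  # GÃ¼tekriterien.txt
```

The last line matches the file name printed before the 404 above, so
the query string would need f=work/G%C3%BCtekriterien.txt to name the
file actually stored in the repository.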

BTW, there should probably be a test for UTF-8 encoding, similar to
the one for XSS in t9502-gitweb-standalone-parse-output
-- 
Jakub Narębski