mod_cgi: multibyte characters in REQUEST_URI are not converted to a correct PATH_INFO

I recently set up Apache 2.2.17 on Windows Server 2003 and configured ViewVC in
CGI mode. ViewVC works fine except when browsing repository entries whose names
contain Chinese characters; for those entries it returns HTTP 404. I asked on the
viewvc-users mailing list and was told that a CGI program interacts with the
system using the locale of the environment in which it is running (
http://viewvc.tigris.org/ds/viewMessage.do?dsForumId=4255&dsMessageId=2686631 ).


I tried a small shell CGI script like the following:
################################################################################
#!C:\cygwin\bin\bash.exe
# cgi-test.sh
# Print the CGI environment variables as an HTML page
echo Content-type: text/html 
echo 
echo "<html>" 
echo "<head>" 
echo "<title>" 
echo "CGI Environment Variable" 
echo "</title>" 
echo "</head>" 
echo "<body>" 
echo "SERVER_SOFTWARE=$SERVER_SOFTWARE<br/>" 
echo "SERVER_NAME=$SERVER_NAME<br/>" 
echo "SERVER_PROTOCOL=$SERVER_PROTOCOL<br/>" 
echo "SERVER_PORT=$SERVER_PORT<br/>" 
echo "REQUEST_METHOD=$REQUEST_METHOD<br/>" 
echo "GATEWAY_INTERFACE=$GATEWAY_INTERFACE<br/>" 
echo "PATH_INFO=$PATH_INFO<br/>" 
echo "PATH_TRANSLATED=$PATH_TRANSLATED<br/>" 
echo "REMOTE_HOST=$REMOTE_HOST<br/>" 
echo "REMOTE_ADDR=$REMOTE_ADDR<br/>" 
echo "REMOTE_IDENT=$REMOTE_IDENT<br/>" 
echo "SCRIPT_NAME=$SCRIPT_NAME<br/>" 
echo "QUERY_STRING=$QUERY_STRING<br/>" 
echo "CONTENT_TYPE=$CONTENT_TYPE<br/>" 
echo "CONTENT_LENGTH=$CONTENT_LENGTH<br/>" 

echo "<pre>"
/bin/env
echo "</pre>"

echo "</body>" 
echo "</html>" 

exit 0
################################################################################

I then requested it with two URLs that carry the same path in different
encodings, UTF-8 and GBK.

"中文" as a UTF-8 encoded URL:
http://localhost/cgi-bin/cgi-test.sh/%E4%B8%AD%E6%96%87

"中文" as a GBK encoded URL:
http://localhost/cgi-bin/cgi-test.sh/%D6%D0%CE%C4


The byte values of the Chinese characters in the resulting HTML are not correct:

UTF-8:
src :    E4    B8    AD    E6    96    87
dest: C3 A4 C2 B8 C2 AD C3 A6 C2 96 C2 87

GBK:
src :    D6    D0    CE    C4
dest: C3 96 C3 90 C3 8E C3 84
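
The pattern in the byte dumps above can be reproduced outside Apache. The
following is only a minimal sketch, assuming a POSIX shell with od, tr and iconv
available (as under Cygwin); the helper names hex_dump and latin1_to_utf8 are
made up for illustration. It treats every source byte as one ISO-8859-1
character and re-encodes it as UTF-8, which yields the same "dest" bytes. It
describes the observed transformation only; it does not show which component
(Apache, the Windows environment block, or Cygwin) performs it.

```shell
#!/bin/sh
# Hypothetical helpers for illustration; not part of the original CGI script.

# Print the bytes read from stdin as one run of lowercase hex digits.
hex_dump() {
    od -An -tx1 | tr -d ' \t\n'
    echo
}

# Treat each input byte as one ISO-8859-1 character and re-encode it as UTF-8.
latin1_to_utf8() {
    iconv -f ISO-8859-1 -t UTF-8
}

# Octal escapes are used because every POSIX printf understands them.
utf8_src='\344\270\255\346\226\207'   # E4 B8 AD E6 96 87 (UTF-8 path bytes)
gbk_src='\326\320\316\304'            # D6 D0 CE C4       (GBK path bytes)

printf "$utf8_src" | latin1_to_utf8 | hex_dump   # c3a4c2b8c2adc3a6c296c287
printf "$gbk_src"  | latin1_to_utf8 | hex_dump   # c396c390c38ec384
```

Both outputs match the "dest" rows, so the result looks the same regardless of
whether the URL was encoded as UTF-8 or GBK: each byte XX simply ends up as the
UTF-8 form of the character U+00XX.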


I also tried adding "SetEnv LC_ALL zh_CN.GBK" or "SetEnv LC_ALL zh_CN.UTF-8" to
httpd.conf, as suggested on the viewvc-users mailing list, and even tried
setting a Windows system environment variable LC_ALL, but neither helped. Am I
missing something, or does mod_cgi not support multibyte characters in
REQUEST_URI?

Thanks!


---------------------------------------------------------------------
The official User-To-User support forum of the Apache HTTP Server Project.
See <URL:http://httpd.apache.org/userslist.html> for more info.
To unsubscribe, e-mail: users-unsubscribe@xxxxxxxxxxxxxxxx
   "   from the digest: users-digest-unsubscribe@xxxxxxxxxxxxxxxx
For additional commands, e-mail: users-help@xxxxxxxxxxxxxxxx