On 22/09/11 23:23, Javier Amor garcia wrote:
Hello,
I am working on an access.log parser for Squid and I have trouble with
some URLs that contain non-US characters, such as Spanish accents.
To fix the issues with the parser I need to know the following:
Is the character encoding used for the log files always the same, or is
it system dependent?
Neither. It is configuration dependent.
See http://www.squid-cache.org/Doc/config/logformat/
i.e. the format code modifiers:
  "   output in quoted string format
  [   output in squid text log format as used by log_mime_hdrs
  #   output in URL quoted format
  '   output as-is
  -   left aligned
The default for URI fields should be URL-encoding according to the URI
specifications, which means RFC 1738 encoding of all non-ASCII
characters in the path and query sections, and Punycode encoding of
characters in the host authority section (although the Punycode encoding
is done by the browser; Squid is agnostic).
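
On the parser side, undoing both encodings could look roughly like the
Python sketch below. The function names are made up for illustration, and
the Latin-1 fallback is only an assumption about what older clients might
send; it is not something Squid does or guarantees.

```python
from urllib.parse import urlsplit, unquote_to_bytes


def decode_percent(text):
    """Turn %XX escapes back into characters, trying UTF-8 before Latin-1."""
    raw = unquote_to_bytes(text)
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        # Assumption: bytes that are not valid UTF-8 are treated as Latin-1.
        return raw.decode("latin-1")


def decode_host(host):
    """Convert Punycode (xn--) labels in a host name back to Unicode."""
    try:
        return host.encode("ascii").decode("idna")
    except UnicodeError:
        # Leave the host untouched if it is not valid ASCII/IDNA.
        return host


def decode_logged_url(url):
    """Split a logged URL and decode host, path and query separately."""
    parts = urlsplit(url)
    return {
        "host": decode_host(parts.hostname or ""),
        "path": decode_percent(parts.path),
        "query": decode_percent(parts.query),
    }
```

For example, decode_logged_url("http://example.com/caf%C3%A9?q=a%F1o")
gives the path "/café" (UTF-8 escapes) and the query "q=año" (falling back
to Latin-1 for the lone %F1).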
Is there some way to explicitly force Squid to use a given charset (or
UTF-8) in its log files?
All Squid log files are UTF-8. Some specific characters are URL-encoded
to keep each log entry on one line. Everything else is written as-is.
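
A parser that wants to be defensive about this could read the file as
bytes and decode each line itself. The sketch below is only an
illustration: the function names are hypothetical, and it assumes the
native "squid" log format, where the request method is the 6th
whitespace-separated field and the URL is the 7th.

```python
def parse_line(raw_line):
    """Decode one access.log line and pull out the method and URL fields."""
    # Assumption: the log is UTF-8; undecodable bytes stay visible as \xNN.
    text = raw_line.decode("utf-8", errors="backslashreplace")
    fields = text.split()
    # Assumption: native "squid" format (method is field 6, URL is field 7).
    if len(fields) < 7:
        return None
    return {"method": fields[5], "url": fields[6]}


def parse_log(path):
    """Yield one parsed entry per well-formed line of the log at `path`."""
    # Binary mode, so a stray non-UTF-8 byte cannot abort the whole read.
    with open(path, "rb") as handle:
        for raw_line in handle:
            entry = parse_line(raw_line)
            if entry is not None:
                yield entry
```

Using "backslashreplace" rather than "strict" means one odd entry shows
up as an escaped byte in the parsed URL instead of raising an exception.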
Amos