Re: mod_ext_filter cmd output is garbage

I'll continue to top-post..

I really don't know enough about how mod_proxy handles things in the forward direction, to be able to help more. I see a mod_deflate somewhere in your log, indicating that some compression is taking place, but I don't know if that's before or after your filter comes into play.

Only one thing: the response from the remote server may be text/html (in the Content-Type header), but it may also be compressed (as indicated by the Content-Encoding header). I don't know if mod_proxy, per se, would always decompress it before passing it to, or through, your filter. If not, then your filter may be seeing a mix of uncompressed and compressed HTML pages, and your example sed filter may just have been "lucky" and happened to run only on uncompressed content.
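To make the compression point concrete, here is a minimal Python sketch of a filter step that checks whether the body it receives is gzip-compressed before substituting. The function names are hypothetical, and checking the two gzip magic bytes is only a heuristic; the real signal is the response's Content-Encoding header, which an external filter reading stdin may not get to see.

```python
import gzip


def looks_gzipped(body: bytes) -> bool:
    # gzip streams start with the two magic bytes 0x1f 0x8b
    return body[:2] == b"\x1f\x8b"


def filter_body(body: bytes, old: bytes, new: bytes) -> bytes:
    """Substitute old with new, decompressing and recompressing if needed."""
    if looks_gzipped(body):
        text = gzip.decompress(body)
        return gzip.compress(text.replace(old, new))
    return body.replace(old, new)


if __name__ == "__main__":
    plain = b"<html><body>foo</body></html>"
    packed = gzip.compress(plain)
    # Same page, once uncompressed and once compressed
    print(filter_body(plain, b"foo", b"bar"))
    print(gzip.decompress(filter_body(packed, b"foo", b"bar")))
```

A plain s/foo/bar/ applied directly to the compressed bytes would most likely match nothing, or corrupt the stream if it did match.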

For the charset and encoding, you have to look at the possible "charset" attribute in the Content-Type header. There also, you may have been lucky: characters in the strict US-ASCII printable range have the same encoding in ISO-8859-1 (the default in HTTP) as in UTF-8 (a single byte per character, and the same value, incidentally). But if you ever got HTML pages with these funny accented non-English characters, that would no longer be the case, and your s/foo/bar/ stuff may create a real mess.
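A small Python sketch of the charset issue, assuming the filter somehow has access to the Content-Type value (the helper names charset_of and substitute are made up for illustration):

```python
from email.message import Message


def charset_of(content_type: str, default: str = "iso-8859-1") -> str:
    """Return the charset parameter of a Content-Type value, or the default."""
    msg = Message()
    msg["Content-Type"] = content_type
    return msg.get_content_charset(default)


def substitute(body: bytes, content_type: str, old: str, new: str) -> bytes:
    """Decode with the declared charset, substitute, then re-encode."""
    charset = charset_of(content_type)
    return body.decode(charset).replace(old, new).encode(charset)


if __name__ == "__main__":
    # Pure ASCII text is byte-identical in both encodings
    print("foo".encode("iso-8859-1") == "foo".encode("utf-8"))
    # An accented character is not: one byte versus two
    print("café".encode("iso-8859-1"), "café".encode("utf-8"))
    latin1_page = "<p>café</p>".encode("iso-8859-1")
    print(substitute(latin1_page, "text/html", "café", "thé"))
```

A byte-level pattern written in UTF-8 would simply not match the ISO-8859-1 page above, which is why decoding according to the declared charset comes first.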

And we haven't even started talking about chunked encoding here...

All in all, for this kind of usage, and supposing that all you are trying to do is to add some kind of footer to incoming HTML pages, you would really need to:
a) parse the incoming html into some kind of memory structure
b) insert your stuff where appropriate in the structure
c) re-assemble the html before forwarding it to the client
I am not sure that the investment to do that is really worth it for your expected benefit.
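For what it is worth, steps a) to c) can be sketched in Python with the standard library's html.parser. This is a simplified streaming variant under stated assumptions: it re-emits tokens as it sees them instead of building a full in-memory tree, it assumes the body has already been decompressed and decoded to text, and the FooterInserter and add_footer names are invented for the example.

```python
from html.parser import HTMLParser


class FooterInserter(HTMLParser):
    """Re-emit the parsed HTML, adding a footer just before </body>."""

    def __init__(self, footer: str):
        # Keep entities such as &amp; as written instead of decoding them
        super().__init__(convert_charrefs=False)
        self.footer = footer
        self.out = []
        self.inserted = False

    def handle_starttag(self, tag, attrs):
        self.out.append(self.get_starttag_text())

    def handle_startendtag(self, tag, attrs):
        self.out.append(self.get_starttag_text())

    def handle_endtag(self, tag):
        if tag == "body" and not self.inserted:
            self.out.append(self.footer)
            self.inserted = True
        self.out.append(f"</{tag}>")

    def handle_data(self, data):
        self.out.append(data)

    def handle_entityref(self, name):
        self.out.append(f"&{name};")

    def handle_charref(self, name):
        self.out.append(f"&#{name};")

    def handle_comment(self, data):
        self.out.append(f"<!--{data}-->")

    def handle_decl(self, decl):
        self.out.append(f"<!{decl}>")

    def handle_pi(self, data):
        self.out.append(f"<?{data}>")


def add_footer(html: str, footer: str) -> str:
    parser = FooterInserter(footer)
    parser.feed(html)
    parser.close()
    if not parser.inserted:
        # No </body> seen: fall back to appending at the very end
        parser.out.append(footer)
    return "".join(parser.out)


if __name__ == "__main__":
    page = "<html><body><p>Hi &amp; bye</p></body></html>"
    print(add_footer(page, "<div>footer</div>"))
```

Even this small version shows how much more is involved than a one-line substitution, and it still ignores compression, charsets and chunking.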

By the way, have you looked at :
http://httpd.apache.org/docs/2.2/mod/mod_substitute.html
(but I'm not sure even that one takes charsets into account).
and maybe also
http://httpd.apache.org/docs/2.2/mod/mod_charset_lite.html

I also vaguely remember that there was a module that really allowed modifying HTML content on the way out, but I can't find it in the list of standard Apache 2.2 modules.


Marcos Mendez wrote:
Yes, absolutely. I've set up a forward proxy, where I have to open a
port (8080) for people to use it. I've set the filter type to
text/html. So I guess it's definitely an encoding issue. Is there any
way to solve that? Strangely enough, the sed filter examples work no
matter what. So I don't understand why this doesn't.

I'm including the log output for the request...

...

[Wed Oct 21 17:25:31 2009] [debug] mod_ext_filter.c(628): [client
172.16.1.199] filtering `http://skyblender.com/' of type `text/html'
through `/etc/apache2/simple.php', cfg ExtFilterOptions DebugLevel=10
NoLogStderr !PreserveContentLength ExtFilterInType text/html
ExtFilterOuttype (unchanged)
...
[Wed Oct 21 17:25:31 2009] [debug] mod_deflate.c(619): [client
172.16.1.199] Zlib: Compressed 531 to 362 : URL http://skyblender.com/
[Wed Oct 21 17:25:31 2009] [debug] mod_proxy_http.c(1807): proxy: end body send
[Wed Oct 21 17:25:31 2009] [debug] proxy_util.c(2009): proxy: HTTP:
has released connection for (*)


