Thanks Andre. I got Substitute working in no time. :)

On Wed, Oct 21, 2009 at 6:32 PM, André Warnier <aw@xxxxxxxxxx> wrote:
> I'll continue to top-post..
>
> I really don't know enough about how mod_proxy handles things in the forward
> direction, to be able to help more. I see a mod_deflate somewhere in your
> log, indicating that some compression is taking place, but I don't know if
> that's before or after your filter comes into play.
>
> Only one thing : the response from the remote server may be text/html (in
> the Content-type header), but it may also be compressed (as per the
> Content-Encoding header). I don't know if mod_proxy, per se, would always
> decompress it before passing it to, or through, your filter.
> If not, then your filter may be seeing alternately uncompressed and
> compressed html pages; and your example sed filter may just have been
> "lucky", and happened to run only on uncompressed stuff.
>
> For the charset and encoding, you have to look at the possible "charset"
> attribute in the Content-type. There also, you may have been lucky :
> characters in the strict US-ASCII printable range have the same encoding in
> iso-8859-1 (the default in http) as in UTF-8 (a single byte per character,
> and the same value incidentally). But if you ever got html pages with these
> funny accented non-English characters, that would no longer be the case, and
> your s/foo/bar/ stuff may create a real mess.
>
> And we haven't even started talking about chunked encoding here...
>
> All in all, for this kind of usage, and supposing that all you are trying to
> do is to add some kind of footer or so to incoming html pages, even for that
> you would really need to
> a) parse the incoming html into some kind of memory structure
> b) insert your stuff where appropriate in the structure
> c) re-assemble the html before forwarding it to the client
> I am not sure that the investment to do that is really worth it for your
> expected benefit.
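
For what it's worth, the a) / b) / c) steps above can be sketched in a few
lines of Python. This is only an illustration, not something mod_ext_filter
or mod_substitute does. The names FooterInserter and insert_footer are made
up for this example. Instead of building a full in-memory tree, it walks the
events of the standard html.parser module and re-emits them, adding the
footer just before the first </body>. It assumes the body has already been
decompressed and decoded to text.

```python
from html.parser import HTMLParser


class FooterInserter(HTMLParser):
    """Re-emit parsed HTML, adding a footer before the first </body>.

    Hypothetical helper for illustration only.
    """

    def __init__(self, footer: str):
        # convert_charrefs=False keeps entities such as &amp; untouched.
        super().__init__(convert_charrefs=False)
        self.footer = footer
        self.out = []
        self.inserted = False

    def handle_starttag(self, tag, attrs):
        # get_starttag_text() returns the tag exactly as it was written.
        self.out.append(self.get_starttag_text())

    def handle_startendtag(self, tag, attrs):
        self.out.append(self.get_starttag_text())

    def handle_endtag(self, tag):
        if tag == "body" and not self.inserted:
            self.out.append(self.footer)
            self.inserted = True
        # Note: end tags are re-emitted in lower case.
        self.out.append(f"</{tag}>")

    def handle_data(self, data):
        self.out.append(data)

    def handle_entityref(self, name):
        self.out.append(f"&{name};")

    def handle_charref(self, name):
        self.out.append(f"&#{name};")

    def handle_comment(self, data):
        self.out.append(f"<!--{data}-->")

    def handle_decl(self, decl):
        self.out.append(f"<!{decl}>")

    def handle_pi(self, data):
        self.out.append(f"<?{data}>")


def insert_footer(html: str, footer: str) -> str:
    """Parse html, insert footer before </body>, and re-assemble the page."""
    parser = FooterInserter(footer)
    parser.feed(html)
    parser.close()
    return "".join(parser.out)


page = "<html><body><p>caf&eacute; &amp; bar</p></body></html>"
print(insert_footer(page, "<hr><p>footer</p>"))
```

A page without a </body> end tag passes through unchanged in this sketch, so
a real filter would have to decide what to do in that case.
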
>
> By the way, have you looked at :
> http://httpd.apache.org/docs/2.2/mod/mod_substitute.html
> (but I'm not sure even that one takes charsets into account).
> and maybe also
> http://httpd.apache.org/docs/2.2/mod/mod_charset_lite.html
>
> I also remember vaguely that there was a module which really allowed one to
> modify html content on the way out, but I don't find it in the list of
> standard Apache 2.2 modules.
>
>
> Marcos Mendez wrote:
>>
>> Yes absolutely. I've set up a forward proxy, where I have to open a
>> port (8080) for people to use it. I've set the filter type to
>> text/html. So I guess it's definitely an encoding issue. Any idea how
>> to solve that? Strangely enough, the sed filter examples work no
>> matter what. So I don't understand why this doesn't.
>>
>> I'm including the log output for the request...
>>
> ...
>
>> [Wed Oct 21 17:25:31 2009] [debug] mod_ext_filter.c(628): [client
>> 172.16.1.199] filtering `http://skyblender.com/' of type `text/html'
>> through `/etc/apache2/simple.php', cfg ExtFilterOptions DebugLevel=10
>> NoLogStderr !PreserveContentLength ExtFilterInType text/html
>> ExtFilterOuttype (unchanged)
>
> ...
>>
>> [Wed Oct 21 17:25:31 2009] [debug] mod_deflate.c(619): [client
>> 172.16.1.199] Zlib: Compressed 531 to 362 : URL http://skyblender.com/
>> [Wed Oct 21 17:25:31 2009] [debug] mod_proxy_http.c(1807): proxy: end body
>> send
>> [Wed Oct 21 17:25:31 2009] [debug] proxy_util.c(2009): proxy: HTTP:
>> has released connection for (*)
>>
>
>
> ---------------------------------------------------------------------
> The official User-To-User support forum of the Apache HTTP Server Project.
> See <URL:http://httpd.apache.org/userslist.html> for more info.
> To unsubscribe, e-mail: users-unsubscribe@xxxxxxxxxxxxxxxx
>    "   from the digest: users-digest-unsubscribe@xxxxxxxxxxxxxxxx
> For additional commands, e-mail: users-help@xxxxxxxxxxxxxxxx
>
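
On the compression and charset caveats raised earlier in the thread, here is
a small Python sketch of what a filter would have to do before a plain
s/foo/bar/ is safe. It is an assumption-laden illustration, not how
mod_substitute or mod_ext_filter work: substitute_body is a made-up name,
only gzip is handled, and any chunked transfer coding is assumed to have been
removed already by the HTTP layer.

```python
import gzip


def substitute_body(body: bytes, content_encoding: str, charset: str,
                    old: str, new: str) -> bytes:
    """Replace old with new in an HTTP body (hypothetical helper).

    The body is decompressed if Content-Encoding says gzip, decoded with the
    charset from Content-Type, edited as text, then encoded and compressed
    again the same way.
    """
    compressed = content_encoding.lower() == "gzip"
    raw = gzip.decompress(body) if compressed else body
    text = raw.decode(charset).replace(old, new)
    raw = text.encode(charset)
    return gzip.compress(raw) if compressed else raw


# Pure US-ASCII text has the same bytes in iso-8859-1 and in UTF-8 ...
assert "foo".encode("iso-8859-1") == "foo".encode("utf-8")
# ... but an accented character does not (1 byte versus 2 bytes).
assert "\u00e9".encode("iso-8859-1") != "\u00e9".encode("utf-8")

page = "<p>caf\u00e9 foo</p>"
gzipped_utf8 = gzip.compress(page.encode("utf-8"))
result = substitute_body(gzipped_utf8, "gzip", "utf-8", "foo", "bar")
print(gzip.decompress(result).decode("utf-8"))
```

A filter that skips the decompress step would be searching the compressed
bytes, where the pattern normally does not appear as plain text at all.
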