Thanks for the ideas, Joshua. I will look into them!
Patrick Presto
IT Advisory Specialist
IBM Global Services
720-540-1082
Joshua Slive <jslive@xxxxxxxxx>
04/14/2005 12:06 PM
Please respond to: users
To: users@xxxxxxxxxxxxxxxx
Subject: Re: [users@httpd] Does Module exists for manipulating html text?
On 4/14/05, Patrick Presto <ppresto@xxxxxxxxxx> wrote:
>
> I've tried everything at this point and I'm hoping the gurus of the
> Apache list might be able to help!
>
> We have an interesting request from our client, who wants us to remove
> comments and JavaScript from their web pages during a request. Their
> requirement is that the page size be 6K or less. Because of the dynamic
> programs they use to build these pages, the size becomes about 25K.
> They want to dynamically update validation and finally production with
> all the same content. They want the end users to request the page and
> have our web servers filter the response and remove all the extra junk
> from the page.
>
> To get this to work I used the ext_filter module and a compiled C
> program that uses regular expressions to remove all the junk.
> ExtFilterDefine replace cmd="ssFormatter"
>
> The problem with this is that the program is called outside of the
> Apache process and creates a lot of overhead. This site gets about
> a million+ hits a day. Currently we are serving about 800-1000
> pages/sec on each web server. After load testing with this new filter
> in place we averaged 11 pages/sec.
>
> Does a module exist today that has the functionality to rewrite the
> HTML response?
> Any other ideas out there besides ExtFilterDefine?

First, on a site doing 1000 requests/sec, you're going to see a
serious hit no matter what if you choose to do extensive processing on
every request. You are *much* better off with static, unprocessed
pages, where Apache can use sendfile to make things quicker. But you
are correct that mod_ext_filter is about the slowest way to do this.

So the first thing that I would do is to try to convince the clients
to do the processing in advance, assuming that you are serving static
files.

A second option is to use mod_deflate to shrink the responses. You'll
probably find it is at least as effective as your technique in terms
of reducing the size of the stuff going over the network, and
certainly much, much faster.
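
For illustration, a minimal mod_deflate setup might look like the
sketch below. The module path and the choice to compress only
text/html are assumptions; adjust them to your build and content.

```apache
# Load mod_deflate (the path is an assumption; it depends on how httpd was built)
LoadModule deflate_module modules/mod_deflate.so

# Compress HTML responses for clients that send "Accept-Encoding: gzip"
AddOutputFilterByType DEFLATE text/html
```

Compression only applies when the client advertises gzip support, and
it shrinks the bytes on the wire rather than the page itself, so the
comments and scripts are still there once the browser decompresses it.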

Third, you should look at Nick's mod_publisher:
http://apache.webthing.com/mod_publisher/

Last, you can write a custom Apache filter to do the job. Look at
mod_case_filter in the experimental module directory of the Apache
source for an example.
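
As a rough sketch of how that example filter gets switched on once it
is built: the module path below is an assumption, and the CaseFilter
directive is from my memory of the module source, so check
mod_case_filter.c before relying on it.

```apache
# The path is an assumption; mod_case_filter is experimental and not built by default
LoadModule case_filter_module modules/mod_case_filter.so

# Turn on the example output filter, which upper-cases response bodies
CaseFilter on
```

A filter that strips comments and script blocks would follow the same
pattern. Because it runs inside the Apache process, it avoids the
external program that mod_ext_filter has to call.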
Joshua.
---------------------------------------------------------------------
The official User-To-User support forum of the Apache HTTP Server Project.
See <URL:http://httpd.apache.org/userslist.html> for more info.