Re: [users@httpd] Does Module exists for manipulating html text?

On 4/14/05, Patrick Presto <ppresto@xxxxxxxxxx> wrote:
>  
> I've tried everything at this point and I'm hoping the gurus of the Apache
> list might be able to help!
>  
> We have an interesting request from our client, who wants us to remove
> comments and JavaScript from their web pages during a request.  Their
> requirement is that the page size be 6K or less.  Because of the dynamic
> programs they use to build these pages, the size becomes about 25K.  They
> want to dynamically update validation and finally production with all the
> same content.  They want the end users to request the page and have our web
> servers filter the response and remove all the extra junk from the page.
>  
> To get this to work I used the ext_filter module and a compiled C program
> that uses regular expressions to remove all the junk. 
> ExtFilterDefine  replace cmd="ssFormatter" 
>  
> The problem with this is that the program is called outside of the Apache
> process and creates a lot of overhead.  This site gets over a million
> hits a day.  Currently we are serving about 800-1000 pages/sec on each web
> server.  After load testing with this new filter in place we averaged
> 11 pages/sec.
>  
> Does a module exist today that has the functionality to rewrite the HTML
> response?
> Any other ideas out there besides ExtFilterDefine?

First, on a site doing 1000 requests/sec, you're going to see a
serious performance hit no matter what if you do extensive processing
on every request.  You are *much* better off with static, unprocessed
pages, where Apache can use sendfile to make things quicker.  But you
are correct that mod_ext_filter is about the slowest way to do this,
since it runs an external program for each response.

So the first thing that I would do is to try to convince the clients
to do the processing in advance, assuming that you are serving static
files.

A second option is to use mod_deflate to compress the responses.  You'll
probably find it is at least as effective as your technique in terms
of reducing the number of bytes going over the network, and it will
certainly be much faster.
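
For reference, a minimal configuration sketch for the mod_deflate
approach might look like the following.  It assumes mod_deflate is
available as a loadable module (the module path shown is an assumption
and depends on your build) and that the pages are served as text/html.
Note that this only shrinks what goes over the wire for clients that
accept gzip; whether that satisfies the client's 6K requirement depends
on how they measure page size, which the original message does not say.

```apache
# Assumption: mod_deflate was built as a shared module at this path.
LoadModule deflate_module modules/mod_deflate.so

# Compress HTML responses for clients that send "Accept-Encoding: gzip".
AddOutputFilterByType DEFLATE text/html

# Optional: trade CPU time for compression ratio (1 = fastest, 9 = smallest).
DeflateCompressionLevel 6
```

Because the compression runs inside the server process as an output
filter, it avoids the per-request external program that makes
mod_ext_filter so slow.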

Third, you should look at Nick's mod_publisher:
http://apache.webthing.com/mod_publisher/

Last, you can write a custom Apache output filter to do the job.  Look
at mod_case_filter in the experimental module directory of the Apache
source for an example.

Joshua.
