Vizion <vizion@xxxxxxxxxxxxxxxxxxxx> writes:

> On Thursday 02 June 2005 05:15, the author Fredrik Steen contributed to
> the dialogue on [users@httpd] Re: Apache 2.x remote logging:
>
>> Vizion <vizion@xxxxxxxxxxxxxxxxxxxx> writes:
>>> On Thursday 02 June 2005 02:13, the author Fredrik Steen contributed
>>> to the dialogue on [users@httpd] Apache 2.x remote logging:
>>>
>>>> I'm in the process of building a web farm consisting of somewhere
>>>> between 10 and 30 web servers handling 300k virtual hosts, and have
>>>> started to look at how to handle logging for these hosts.
>>>>
>>>> Does anyone have any good ideas for handling that massive amount of
>>>> log files? Preferably using some sort of remote log collector.
>>>>
>>>> I have looked at:
>>>>  - mod-witch: http://savannah.nongnu.org/projects/mod-witch
>>>>    (Will syslog handle it?)
>>>>  - mod_log_spread: http://www.backhand.org/mod_log_spread/
>>>>    (Apache 2.x module status?)
>>>>  - mod_log_sql: http://www.outoforder.cc/projects/apache/mod_log_sql/
>>>>    (Performance?)
>>>>
>>>> Any recommendations or ideas for handling this?
>>>
>>> With such a massive logging requirement, I feel some analysis of the
>>> purpose you intend to fulfill by using the logs could well drive the
>>> selection of methods and processes for dealing with them, and hence
>>> your choice of solution.
>>> Are there any top-level plans which describe each purpose, and some
>>> idea of the data volumes applicable to each purpose, that you feel
>>> able to share with us?
>>
>> The idea is to use the logs for generating statistics (daily) for each
>> virtual host.
>
> In that case the virtual host HTTP access data could remain on each
> machine and you would have no need to use a central log collector for
> it.

All the virtual hosts are shared between all of the web servers and load
balanced, so every web server will have log file(s) for every virtual
host. (I should of course have mentioned that in my previous mail,
sorry.)

The most straightforward thing to do would be to collect the log file(s)
from the web servers, then split, merge and sort them per virtual host
and generate statistics for each one. (A rough, untested sketch of that
step is appended below my signature.)

Thinking about it, I assumed this was a common problem for larger setups
and that there would be some nice solutions for it, but searching the
mailing lists and the web suggests that is not the case (at least for
Apache 2.x).

I appreciate your help.

> That seems the simple approach. You might want to consider
> automatically generating a summary report to each user every week, with
> a link to the raw data, and tell them it will be held for X days and
> thereafter deleted. That way you remove archival demands and provide an
> appreciated service. If the summary file is designed so that it
> incorporates all the stats you need, then a copy of that email could be
> cc'd to a system which amalgamates the data and produces your own
> summary statistics. If that is workable for you, it would mean you do
> not need to add any more complexity to your infrastructure. My two
> penn'orth.
>
>> We will use webalizer[1] and/or awstats[2] for presentation.
>> The logs will be saved for X months and then deleted.

That's the plan. Thanks.

-- 
.Fredrik Steen
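For what it's worth, here is the rough sketch mentioned above of the
split/merge/sort step. It is untested and built on assumptions: the logs
are presumed to have already been copied from each web server into a
local spool directory, and each server is presumed to log with the
virtual host name as the first field (an Apache LogFormat beginning with
"%v "). All paths, directory names and the output layout below are just
placeholders, not a recommendation.

#!/usr/bin/env python3
"""Sketch: split per-server Apache access logs by virtual host, then
merge and time-sort them into one file per vhost for webalizer/awstats.

Assumptions (placeholders, not from the actual setup):
  - logs from every web server have already been copied to
    /var/spool/weblogs/<servername>/access_log
  - each log line starts with the virtual host name (a LogFormat that
    begins with "%v ")
"""
import glob
import os
import re
from collections import defaultdict
from datetime import datetime

SPOOL_DIR = "/var/spool/weblogs"             # hypothetical input directory
OUT_DIR = os.path.join(SPOOL_DIR, "merged")  # hypothetical output directory
TS_RE = re.compile(r"\[([^\]]+)\]")          # the %t field, e.g. [02/Jun/2005:05:15:00 +0000]


def request_time(line):
    """Return the request timestamp of a log line, or None if unparsable."""
    m = TS_RE.search(line)
    if not m:
        return None
    try:
        return datetime.strptime(m.group(1), "%d/%b/%Y:%H:%M:%S %z")
    except ValueError:
        return None


def main():
    per_vhost = defaultdict(list)  # vhost name -> list of (timestamp, log line)

    # Split: bucket every line from every server's log by its leading %v field.
    for path in glob.glob(os.path.join(SPOOL_DIR, "*", "access_log")):
        with open(path, encoding="utf-8", errors="replace") as fh:
            for line in fh:
                vhost, _, rest = line.partition(" ")
                ts = request_time(rest)
                if vhost and ts is not None:
                    per_vhost[vhost].append((ts, rest))

    # Merge + sort: write one chronologically ordered log per virtual host,
    # which webalizer or awstats can then be pointed at.
    os.makedirs(OUT_DIR, exist_ok=True)
    for vhost, entries in per_vhost.items():
        entries.sort(key=lambda entry: entry[0])
        with open(os.path.join(OUT_DIR, vhost + ".log"), "w", encoding="utf-8") as out:
            out.writelines(line for _, line in entries)


if __name__ == "__main__":
    main()

For 300k virtual hosts this obviously cannot hold everything in memory
in one pass; a real job would stream or shard the servers' logs, but the
split/merge/sort shape stays the same.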