Jonathan Zuckerman wrote:
On Tue, Jul 28, 2009 at 3:37 AM, S.A.<qmt9z3@xxxxxxxxx> wrote:
...Concurring with Jonathan about the free advice and the tenuous relevance to the main list topic, I'd nevertheless want to try to contribute.
My summary of the issue :
- there are N clients accessing the site
- each client is authenticated, with a client-id of some kind
- they all request originally the same URL
- the server however returns a page to each client that can be different, based on a server-side client profile, selected as per the client-id
- the returned page is different, because it includes for each client, a different mixture of "items" in the page, based on the client profile
- each client gets a different selection of i items, but these i items are picked among a grand total of I items, which are themselves always the same
- you would like to cache at least part of these I items in memory, to speed up the responses to the clients
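To make sure we are talking about the same thing, here is a minimal Python sketch of the setup as I understand it from the summary above. All item names, client ids and page fragments are made up for illustration; your real profiles and items will obviously look different.

```python
from dataclasses import dataclass
from typing import Dict, List

# The shared pool of I items, identical for every client.
# Names and contents are hypothetical placeholders.
ITEMS: Dict[str, str] = {
    "news": "<div>news</div>",
    "weather": "<div>weather</div>",
    "video": "<div>video</div>",
    "ads": "<div>ads</div>",
}


@dataclass
class Profile:
    client_id: str
    item_ids: List[str]  # the i items this particular client should get


# Server-side client profiles, keyed by client-id (hypothetical).
PROFILES: Dict[str, Profile] = {
    "alice": Profile("alice", ["news", "video"]),
    "bob": Profile("bob", ["weather", "ads", "news"]),
}


def render_page(client_id: str) -> str:
    """Build the page for one client from the shared pool of items."""
    profile = PROFILES[client_id]
    return "\n".join(ITEMS[item_id] for item_id in profile.item_ids)
```

Both clients request the same URL, but render_page("alice") and render_page("bob") return different mixtures of the same underlying items.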
You haven't given us any hard numbers, like how many clients there are, how concurrently they access the server, how many I items there really are, how large each I item is, how fast the server is, how much memory it has, or anything of the kind. You have mentioned that some of the items I were "media", which I personally tend to associate with "large", byte-wise.
My very first reaction would be to ask myself if it is all really worth it. Caching in memory, no matter how it's done, has a cost. A cost in design, complexity, and in pure cache management. Modern operating systems already cache disk data. So if the same "object" is accessed frequently in a short period of time, it will in practice already be cached in memory buffers by the OS. Below the OS level, good disk controllers also cache frequently accessed data. Below the controllers, disks themselves cache data in cache memory. Caching it yet again, with a different piece of software, may just add overhead.
An additional aspect is that, if some of the objects are large, and your server has limited memory, caching many such objects may fill up the physical memory, and cause the system to start swapping, which would really have the opposite effect to what you're looking for.
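Since we have no hard numbers, the following is only a back-of-the-envelope sketch in Python of the kind of check I mean. The item counts, sizes and the fraction of RAM reserved for the OS and Apache itself are all invented for the example; plug in your own.

```python
def cache_fits(item_count: int, avg_item_bytes: int, ram_bytes: int,
               reserved_fraction: float = 0.5) -> bool:
    """Return True if caching every item fits in the RAM left over
    after reserving a fraction for the OS, Apache and everything else."""
    budget = ram_bytes * (1.0 - reserved_fraction)
    return item_count * avg_item_bytes <= budget


KIB = 1024
MIB = 1024 * KIB
GIB = 1024 * MIB

# Hypothetical case 1: 10,000 small items of 20 KiB each on an 8 GiB server.
small_items_ok = cache_fits(10_000, 20 * KIB, 8 * GIB)

# Hypothetical case 2: 10,000 "media" items of 5 MiB each on the same server.
media_items_ok = cache_fits(10_000, 5 * MIB, 8 * GIB)
```

With these made-up figures the small items need about 195 MiB and fit easily, while the media items would need roughly 49 GiB and would push the machine into swap long before the cache is full.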
On the other hand, accessing an object on disk requires quite a bit of work on Apache's part, and the deeper the object resides in the "document space", the more work it is, because Apache needs to "walk" the directory hierarchy, checking access and other rules at each level. So by organising your objects smartly on disk, so as to minimise the work Apache has to do to find and return them, you may gain a whole lot of processing time.
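As a rough illustration of that "walk", here is a small Python sketch that lists the .htaccess files Apache would have to look for when serving one file, assuming overrides (AllowOverride) are enabled all the way down the path. The file path is made up, and this only models the .htaccess lookups, not the other per-directory checks.

```python
from pathlib import PurePosixPath
from typing import List


def htaccess_candidates(file_path: str) -> List[str]:
    """List the .htaccess files looked up for one file, from the root
    directory down to the directory that contains the file."""
    directory = PurePosixPath(file_path).parent
    # parents runs from the nearest ancestor up to "/", so reverse it.
    chain = list(directory.parents)[::-1] + [directory]
    return [str(d / ".htaccess") for d in chain]


# Hypothetical deep and shallow layouts for the same media file.
deep = htaccess_candidates("/www/htdocs/media/a/b/clip.mp4")
shallow = htaccess_candidates("/www/media/clip.mp4")
```

With these paths, the deep layout gives six lookups per request and the shallow one three. Turning overrides off where you do not need them removes these lookups altogether, which is part of what I mean by an efficient Apache configuration.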
And servers nowadays are cheap. For the time and money you'd spend studying the best caching scheme, you could easily buy an extra server with terabytes of disk space and gigabytes of RAM to use as I/O cache.
So basically what I am saying, is : try it, without any clever caching scheme, but with a clever organisation of your data and an efficient Apache configuration. That /may/ show a problem and a bottleneck, which you can then tackle on its own merits. On the other hand, it may show no problem at all.
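For the "try it" part, something as simple as the following Python sketch may be enough to get first numbers. The measure function and its fetch argument are my own invention for this example, not part of any Apache tool; fetch is whatever callable requests one page from your server.

```python
import statistics
import time
from typing import Callable


def measure(fetch: Callable[[], object], runs: int = 20) -> dict:
    """Call fetch() repeatedly and summarise how long the calls took."""
    timings = []
    for _ in range(runs):
        start = time.perf_counter()
        fetch()
        timings.append(time.perf_counter() - start)
    return {
        "runs": runs,
        "median_s": statistics.median(timings),
        "max_s": max(timings),
    }
```

You could call it as measure(lambda: urllib.request.urlopen(url).read()), with url pointing at a test page on your own server, once for the plain setup and again after each change. If the median is already acceptable under a realistic load, there may be nothing to optimise.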
A lot of work has gone into Apache, to make it as efficient as possible to serve content of all kinds. There are thousands of Apache sites handling thousands of clients, and a lot of content. Do not spend a lot of time ahead of time, to solve what is maybe a non-existent problem. As someone said a long time ago : premature optimisation is the source of much evil.