Thank you for your reply, Nathan.
You are right that this method of caching is different from the two
types you have outlined below. I would not say that it is a new
method, though; in fact, "pushing static files" to the server is very
common. If it weren't for the fact that this method, as I have
designed it, allows a very small PHP overhead to handle dynamic
updating of the cache, I could even have gone the extra mile and
pushed HTML files that would be loaded directly by the end user
without PHP being initialized at all. (My reasons for not taking this
last step should become apparent to those who read the wishlist I
produced at
http://technologies.babywhale.net/cache/ )
Understand that this method does not *exclude* using the other two
methods you have outlined. In fact, I personally make use of
memcached and APC where I feel it is appropriate in my application
design. This does not mean that I cannot also write a cache layer
that makes the application itself and its variables unnecessary for
most site hits (hence a major optimization).
To answer your other questions:
1) Caching on disk could easily be handled instead by caching in
memory, but this approach is meant to be ultra-portable and work
everywhere. There are situations where a viable memory storage
mechanism is simply not available, and others where consuming memory
for this purpose is undesirable and plentiful hard drive space is a
good alternative. I think you will find that this caching method is
heavily speed-tuned and a fast implementation of a portable,
file-system-based method. I would also point out that in
my line of work, where I chiefly have to adopt environments that are
configured under rather political circumstances, it is consistently
this type of caching that the system administrators argue for. As
someone has already pointed out, there may not even be a significant
difference between disk and memory based storage mechanisms on your
server.
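To make the file system approach concrete, here is a minimal sketch of storing and fetching script output on disk. It is not the code from my script: the function names (cache_path, cache_store, cache_fetch), the md5 file naming and the ".html" suffix are all assumptions made for illustration, and it assumes a PHP version with scalar type hints.

```php
<?php
// Illustrative sketch only; all names here are hypothetical.

/** Map a request URI to a file inside $cacheDir (md5 naming is an assumption). */
function cache_path(string $cacheDir, string $uri): string
{
    return rtrim($cacheDir, '/') . '/' . md5($uri) . '.html';
}

/** Write output to a temp file, then rename it into place so readers never see a partial file. */
function cache_store(string $path, string $output): bool
{
    $tmp = $path . '.' . uniqid('', true) . '.tmp';
    if (file_put_contents($tmp, $output) === false) {
        return false;
    }
    return rename($tmp, $path);
}

/** Return the cached output, or null when nothing has been cached yet. */
function cache_fetch(string $path): ?string
{
    if (!is_file($path)) {
        return null;
    }
    $data = file_get_contents($path);
    return $data === false ? null : $data;
}
```

The write-then-rename step is a design choice of this sketch: on most POSIX file systems a rename within one directory replaces the target in a single step, so a hit that arrives mid-write still reads the previous complete copy.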
2) Again, one of the main ideas behind this method is portability.
In order not to rely on cron, server queries, or other external
checks for a stale cache, I have gone with a "refresh interval",
which has been proposed on this list in the past: dynamic content is
refreshed once every X seconds/minutes/hours. This script avoids PHP
date manipulations and instead performs some basic math to handle the
refresh rate, and also to *sync* content to some degree, so that
portions of dynamic content are less likely to refresh haphazardly
and independently and therefore fail to match. I think this is a
slight improvement over code that has been posted here before. In a
practical sense, this means that your application fires and produces
content only once every X minutes, and not each and every time the
page is hit. Furthermore, because in this case it is known ahead of
time when that page will expire, a cache header can be sent with an
exact expiration time so repeated hits by the same end user will not
even trigger a transmission of cached content from the server.
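As a rough sketch of the "basic math" idea (again, not the script's actual code), the refresh window can be derived from the Unix timestamp with a single modulo. The function names and the choice of headers below are assumptions for illustration.

```php
<?php
// Illustrative sketch only; all names here are hypothetical.

/** Start of the refresh window containing $now, using integer math on the timestamp. */
function window_start(int $now, int $interval): int
{
    return $now - ($now % $interval);
}

/** Moment at which the current window ends and content is due for a refresh. */
function window_expires(int $now, int $interval): int
{
    return window_start($now, $interval) + $interval;
}

/** A cached copy is stale if it was written before the current window began. */
function is_stale(int $cachedAt, int $now, int $interval): bool
{
    return $cachedAt < window_start($now, $interval);
}

/** Header lines announcing the exact expiry time to the client. */
function expiry_headers(int $now, int $interval): array
{
    $expires = window_expires($now, $interval);
    return [
        'Expires: ' . gmdate('D, d M Y H:i:s', $expires) . ' GMT',
        'Cache-Control: max-age=' . ($expires - $now),
    ];
}
```

Because every window is aligned to a multiple of the interval, all content sharing the same interval rolls over at the same instant, which is one way to get the syncing effect described above. The header lines could be sent with something like foreach (expiry_headers(time(), 300) as $h) { header($h); }.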
3) With regard to daily purging: for one, if you are going for a
scheduled refresh of content, then you probably already have a
refresh rate that is less than 24 hours, so an additional daily
trigger of recaching should be acceptable. But more
specifically, the reason behind this is that a file system based
caching method does not natively support a TTL on cached files, and
there has to be some way to handle a cache of a script that has since
been deleted. Note that if 24 hours is not acceptable for some
reason, this script can easily be modified to increase that without
negatively affecting anything else.
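For illustration, a purge of this kind could look like the sketch below. It is an assumption about one possible shape, not the script's real purge routine: the function name, the "*.html" pattern and the use of file modification times are all hypothetical, and how the purge gets triggered is left out.

```php
<?php
// Illustrative sketch only; all names here are hypothetical.

/**
 * Delete cache files in $cacheDir that were last modified more than
 * $maxAge seconds ago (default: 24 hours). Returns the number removed.
 */
function purge_cache(string $cacheDir, int $maxAge = 86400, ?int $now = null): int
{
    $now = $now ?? time();
    $removed = 0;
    clearstatcache(); // make sure filemtime() reads fresh values
    foreach (glob(rtrim($cacheDir, '/') . '/*.html') ?: [] as $file) {
        $mtime = filemtime($file);
        if ($mtime !== false && $now - $mtime > $maxAge && unlink($file)) {
            $removed++;
        }
    }
    return $removed;
}
```

A cached copy of a script that has since been deleted is never rewritten, so its modification time keeps ageing until it crosses $maxAge and the file is removed. Passing a larger $maxAge corresponds to lengthening the 24 hour period.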
On Jun 24, 2007, at 11:55 PM, Nathan Nobbe wrote:
Alexander,
sorry to see nobody has replied to your post, im sure you worked
very hard on the cache system and are eager for feedback..
so to me it looks like youve introduced a somewhat new style of
caching here (though im sure there are other such approaches); for
instance i know of 2 main uses for caches at this time [as caching
pertains to php].
caching php intermediate code
caching application variables
both of these caching techniques are designed to overcome
limitations of the language as it ships out of the box, more or
less; afaik.
it appears you are interested in caching the output of php scripts,
which is, i suppose, a third technique that could be added to the
list.
so i have a criticism about your system and a couple questions as
well.
criticism
why cache script output on disk? if a fast cache is your goal, why
not store the result of script output in memory rather than on
disk; that would be much faster
questions
how does your cache system know when cached output is stale and
allow fresh contents to be delivered from the original script
rather than being served from the cache?
why purge cache contents after 24 hours? im on the memcached
mailing list, and recently they were discussing artificially
resetting the cache; several people said they let memcached run for
months on end.
-nathan