Hello,
on 04/06/2005 10:56 PM Cabbar Duzayak said the following:
I am using PHP on Apache/Linux with mod_php4. I need to implement a lazy cache for a resource that should be updated, say, every hour, so that the first visitor who arrives after the hour is up updates the cache. As you can imagine, I need to implement a write lock in this case. The idea is:
1. Retrieve the cached data from database (it is cached because generating it is expensive)
2. If (now() - cache update date) > 1 hour, try to get a lock on some resource
3.a. If lock is acquired, regenerate the cache, update it in the db, unlock the resource, return
3.b. If lock can not be acquired, just display the version retrieved from cache in step 1 and return
This will be slightly different in case of mysql (i.e. lock is blocking in mysql), but I guess you get the idea.
In this case, it looks like there are two types of locks I can use:
i) File locking: You can try to lock a file in non-blocking mode
ii) Use mysql lock/unlock table to manage the locking mechanism, i.e. create some dummy table.
I use file locking all the time with PHP under Apache 1.x for caching all sorts of data coming from the database: content, user profiles, sessions, everything.
It works wonders. The speedup is enormous because accessing a file on disk is much faster than executing a SQL query even when the database server caches queries.
flock() is not slow at all. That is a myth. You just have to make sure that every piece of code that touches the file follows the same locking protocol. Only one process at a time can lock the file exclusively for writing, but many processes can access the file at once in shared mode. As long as you are not updating a file literally every second, the time lost to flock contention is negligible.
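To illustrate the shared versus exclusive distinction, here is a minimal sketch using plain flock() calls. The function names cache_write() and cache_read() are hypothetical and are not the API of the class mentioned in this message. It assumes a PHP version that supports the 'c' mode of fopen().

```php
<?php
// Hypothetical helpers for illustration only.

// Writer: take an exclusive lock, then replace the contents.
function cache_write($path, $data)
{
    $fp = fopen($path, 'c');      // create if missing, do not truncate before locking
    if ($fp === false) {
        return false;
    }
    if (!flock($fp, LOCK_EX)) {   // waits until no other process holds a lock
        fclose($fp);
        return false;
    }
    ftruncate($fp, 0);            // safe to discard old data now that we hold the lock
    $ok = fwrite($fp, $data) !== false;
    fflush($fp);                  // push the data out before releasing the lock
    flock($fp, LOCK_UN);
    fclose($fp);
    return $ok;
}

// Reader: take a shared lock, so many readers can proceed at once.
function cache_read($path)
{
    $fp = @fopen($path, 'r');
    if ($fp === false) {
        return false;             // nothing cached yet
    }
    if (!flock($fp, LOCK_SH)) {
        fclose($fp);
        return false;
    }
    $data = stream_get_contents($fp);
    flock($fp, LOCK_UN);
    fclose($fp);
    return $data;
}
```

The point about doing everything the same way shows up here: the protection only holds if all readers and writers go through functions like these, because flock() locks are advisory.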
To simplify generic file caching, using flock to keep concurrent accesses from corrupting a cache file that is being updated, I use this class to store arbitrary data in cache files. It ensures that a cache file is updated only when a single process is trying to write it.
The very same site on which the class is available has been using it to cache everything for several years. Currently it holds more than 80,000 cache files that occupy over 400 MB.
The good thing about caching practically everything is that you can avoid even establishing database connections once the cache files are up to date. This makes your site handle access surges much better, because the extra Web server processes that get created do not lead to new database connections.
http://www.phpclasses.org/filecache
To do what you want, just set the expiry time of your cache files to 3600 seconds.
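Independently of the class, the lazy refresh described in the question could be sketched with plain flock() calls roughly as follows. This is only a sketch under assumptions: the function name lazy_cache_get(), the separate lock file, and the $regenerate callback are all hypothetical, and the cached data lives in a file rather than in the database.

```php
<?php
// Hypothetical helper: return cached data, refreshing it at most once per $ttl seconds.
// $regenerate is a callable that returns the fresh (expensive) data as a string.
function lazy_cache_get($cacheFile, $lockFile, $ttl, $regenerate)
{
    clearstatcache();   // make sure filemtime() below is not served from PHP's stat cache

    // Step 1: read whatever is currently cached (it may be stale or missing).
    $cached = is_file($cacheFile) ? file_get_contents($cacheFile) : false;
    if ($cached !== false && (time() - filemtime($cacheFile)) <= $ttl) {
        return $cached;
    }

    // Step 2: the cache is stale, so try to take the write lock without blocking.
    $lock = fopen($lockFile, 'c');
    if ($lock === false) {
        return $cached;
    }
    if (flock($lock, LOCK_EX | LOCK_NB)) {
        // Step 3.a: we hold the lock. Regenerate, then publish via rename() so
        // that readers never see a half-written cache file.
        $data = call_user_func($regenerate);
        $tmp = $cacheFile . '.' . getmypid() . '.tmp';
        file_put_contents($tmp, $data);
        rename($tmp, $cacheFile);
        flock($lock, LOCK_UN);
        fclose($lock);
        return $data;
    }

    // Step 3.b: another process is regenerating; serve the copy from step 1.
    fclose($lock);
    return $cached;
}
```

Note that in step 3.b the function returns false when there is no cached copy at all yet, so a caller would have to decide what to show for the very first requests.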
The class also supports preemptive cache invalidation, meaning that it can safely invalidate a cache file to force it to be regenerated the next time it is accessed.
Maybe you do not need this, but I use that feature all the time to force the cache for a page or something else to be rebuilt after the site updates the database information from which the cached data was taken.
For instance, say you are caching the content of an article page. If that article is updated, you invalidate the cache, so the next time the article page is accessed the cached content is regenerated.
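The class has its own way of doing this, which is not reproduced here. As a rough sketch of the general idea only, invalidation can be done by emptying the cache file under an exclusive lock and treating an empty or expired file as not current. The names cache_is_current() and cache_invalidate() are hypothetical.

```php
<?php
// Hypothetical helpers; not the API of the filecache class.

// A cache file counts as current if it exists, is not empty, and is younger than $ttl seconds.
function cache_is_current($path, $ttl)
{
    clearstatcache();   // avoid stale size/mtime values from PHP's stat cache
    return is_file($path)
        && filesize($path) > 0
        && (time() - filemtime($path)) <= $ttl;
}

// Invalidate by truncating the file while holding an exclusive lock, so that
// no reader holding a shared lock is in the middle of reading it.
function cache_invalidate($path)
{
    $fp = @fopen($path, 'r+');
    if ($fp === false) {
        return true;    // nothing cached, so nothing to invalidate
    }
    if (!flock($fp, LOCK_EX)) {
        fclose($fp);
        return false;
    }
    $ok = ftruncate($fp, 0);
    flock($fp, LOCK_UN);
    fclose($fp);
    return $ok;
}
```

In the article example, the code that saves the updated article would call something like cache_invalidate() on that article's cache file right after the database update.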
I would prefer file locking, since it supports non-blocking locks and would definitely be faster than mysql, but I see 2 problems with this:
1. What happens if the PHP code that locked this file (probably the PHP thread in Apache, if mod_php4 supports threading) throws an error or dies? Will the file be unlocked automatically? Or, since I am using mod_php4 and the thread is somehow still alive, will the lock stay there for a long time?
AFAIK, PHP implicitly closes any open files when the script exits, so any outstanding file locks are released.
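A small experiment along these lines can show a lock disappearing once its file handle is closed, without any explicit unlock. It is a sketch that assumes a platform such as Linux, where two separate handles on the same file conflict with each other even inside one process. The helper lock_is_free() and the temporary path are hypothetical.

```php
<?php
// Hypothetical probe: can an exclusive lock be taken right now without waiting?
function lock_is_free($path)
{
    $fp = fopen($path, 'c');
    if ($fp === false) {
        return false;
    }
    $free = flock($fp, LOCK_EX | LOCK_NB);
    if ($free) {
        flock($fp, LOCK_UN);
    }
    fclose($fp);
    return $free;
}

$path = sys_get_temp_dir() . '/flock_demo_' . getmypid() . '.lock';

$holder = fopen($path, 'c');
flock($holder, LOCK_EX);             // take the lock and "forget" to release it

$whileHeld = lock_is_free($path);    // probed while $holder is still open

fclose($holder);                     // no LOCK_UN: the OS drops the lock with the descriptor

$afterClose = lock_is_free($path);   // probed again after the handle is gone

var_dump($whileHeld, $afterClose);
```

On such a platform the first probe should fail and the second should succeed. Script exit closes open handles the same way, which is why a script that dies does not leave the lock behind.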
2. The following warning from php manual:
"On some operating systems flock() is implemented at the process level. When using a multithreaded server API like ISAPI you may not be able to rely on flock() to protect files against other PHP scripts running in parallel threads of the same server instance!"
This warning is moot because most people are not running PHP under multithreaded servers (read MS IIS or Apache 2). The reason is that several PHP extensions are not reentrant, so they cannot run reliably in concurrent threads.
So, since you are most likely running PHP under a non-multithreaded server (Apache 1.x or something else using the PHP CGI executable), never mind what the manual says about flock().
--
Regards, Manuel Lemos
PHP Classes - Free ready to use OOP components written in PHP http://www.phpclasses.org/
PHP Reviews - Reviews of PHP books and other products http://www.phpclasses.org/reviews/
Metastorage - Data object relational mapping layer generator http://www.meta-language.net/metastorage.html