On Sep 19, 2008, at 1:12 PM, Stut wrote:
On 19 Sep 2008, at 18:47, Philip Thompson wrote:
I've narrowed it down to 10 initial queries...
1. Grab system config data (that's used in lots of places)
Does it change often? No? Then cache it in a PHP script. Use
var_export to create a file that you can include which will create
the configuration array. Alternatively cache it in a Memcache
instance which is where my system-wide config usually lives.
2. Grab session data (for SESSION array)
Meaning what? You say below that this is after the initial session
load. What are you loading here and why is it being loaded on every
page request if it's ending up in the $_SESSION array?
3. Grab page id
What does this do, how is it used, is it needed?
4. Grab user privs
IMHO you should only grab these when you need them.
5. Grab user session (for application)
Again, why isn't this already in $_SESSION for every page request
expect the first per visit?
6. Begin transaction
7. Lock user session row
8. Update user session
9. Commit transaction
If all you're doing is issuing an update command there is no need to
do so in a transaction and definitely no need to lock the row. An
update is atomic.
Maybe what you actually mean to do here is lock it before you get
the session data, make changes to it and then unlock it once you're
done changing it. Doing that would likely keep the row locked for
the entire duration of a request which can start causing problems as
traffic increases.
10. Add page tracking (an insert-only table that keeps track of
pages you visit)
I handle this using files and then have an offline processor to push
that data into the database. If all you're doing is adding a row to
the table you probably don't need this, but we do a fair amount of
work for each page view to record the data in a set of tables
designed for meaningful and speedy retrieval.
Note that these are the 10 queries that happen after the initial
SESSION load. I supposed I could reduce this by 1 or 2 queries - I
could store the page id/information in the session. Now with that
said, the queries are negligible (in elapsed time) and required.
However, I'm always open up to suggestions/improvements.... =D
You may think they're required, but I'm betting they're not if you
really think about it. However, if your DB can handle it then why
fix something that ain't broken.
The way I approach this stuff is always with the knowledge that the
database is the most expensive resource in the infrastructure, so
anything I can do to avoid using it when it's not strictly necessary
is something I consider well-worth the effort.
With the rise of frameworks and the lazy architectures it's pretty
common to end up with this mass of DB access at the start of each
request, but it won't scale and it leads to assumptions that are
extremely expensive to find and fix when you do need to scale. Trust
me, I've been there many times and it's been painful every time!
Oh, and by scale I don't necessarily mean to tens of millions of
page views a month. Scalability is as much about going from 10
visitor a day to 1000 as it is from 1000 to several million.
-Stut
Robert/Stut,
Thanks for your words of wisdom. ;) I will take what you've said back
to my team for us to discuss. That's why I like this list - allows me
to view the problem(s) from a different angle, or two. ;)
~Philip
--
PHP General Mailing List (http://www.php.net/)
To unsubscribe, visit: http://www.php.net/unsub.php