On Wed, 2007-06-13 at 12:39 -0500, Richard Lynch wrote: > On Wed, June 13, 2007 10:17 am, Robert Cummings wrote: > > On Wed, 2007-06-13 at 10:36 -0400, Guillaume Theoret wrote: > >> Hi everyone, > >> > >> Can anyone recommend any good books/resources for php and mysql > >> performance management? I'm more interested in the software > >> development side (as opposed to server configuration) of things. I'm > >> looking for anything that would be good to know when working on > >> websites that get millions of hits a day. > >> > >> Also, if anyone knows of any resources/discussions that illustrate > >> the > >> relative performance of joins vs multiple selects I'd love to check > >> it > >> out. > > > > JOIN will almost always be faster by virtue of the query being > > optimized > > and doing the work within a single request. > > In my limited experience, under shared server with seriously > constrained resources, the exact opposite is true... > > Oh, sure, for a SMALL table and an easy JOIN on straight-forward > indexed fields, the JOIN is faster. > > But every time I've ended up with a large table and a JOIN that was > anything remotely interesting (read: complex and un-indexable) the > server just gets swamped. > > And often, a simple straight-forward select to get a handful of rows > followed by another query, or even one query per result row, was MUCH > faster. > > It's entirely possible that I just don't know SQL well enough to get > the schema and indexing right, but I try to index the "obvious" fields > that will be used in the join. Maybe your query is poorly formed. Generally speaking anything that you need to search upon should be indexed provided it has a high enough cardinality. As such a query will usually be optimized to retrieve data using the indexed fields. If the fields aren't indexed then even a second query to retrieve the complimentary data set will still be required to do an exhaustive search without the benefit of indexing. > This also may not even be applicable to a site that scales up with the > kinds of resources you'd throw at that. Hence the reason I mentioned that using multiple selects is more horizontally scalable :) > But I'd certainly recommend writing small simple crude test code with > a dedicated testing server and simulated real life load conditions > rather than just "guessing" at what might be fastest. > > An hour's testing could save you weeks/months of development in this > case. For certain, if you expect such a large load it's in your interest to determine what will be the most resource and cost effective approach. Cheers, Rob. -- .------------------------------------------------------------. | InterJinn Application Framework - http://www.interjinn.com | :------------------------------------------------------------: | An application and templating framework for PHP. Boasting | | a powerful, scalable system for accessing system services | | such as forms, properties, sessions, and caches. InterJinn | | also provides an extremely flexible architecture for | | creating re-usable components quickly and easily. | `------------------------------------------------------------' -- PHP General Mailing List (http://www.php.net/) To unsubscribe, visit: http://www.php.net/unsub.php