On 6/13/07, Richard Lynch <ceo@xxxxxxxxx> wrote:
On Wed, June 13, 2007 12:31 pm, Guillaume Theoret wrote: > Thanks for the response. > > On 6/13/07, Robert Cummings <robert@xxxxxxxxxxxxx> wrote: >> On Wed, 2007-06-13 at 10:36 -0400, Guillaume Theoret wrote: >> > Hi everyone, >> > >> > Can anyone recommend any good books/resources for php and mysql >> > performance management? I'm more interested in the software >> > development side (as opposed to server configuration) of things. >> I'm >> > looking for anything that would be good to know when working on >> > websites that get millions of hits a day. >> > >> > Also, if anyone knows of any resources/discussions that illustrate >> the >> > relative performance of joins vs multiple selects I'd love to >> check it >> > out. >> >> JOIN will almost always be faster by virtue of the query being >> optimized >> and doing the work within a single request. > > Really? I thought the way it worked was that when you joined 2 tables > it needed to create every row combination applicable and then apply > the where clause. In large tables wouldn't this be slower? It's these > kinds of optimizations and when the kick in, etc that I don't know > much about. Conceptually, JOIN builds that monster table. If the DB engine can figure out how to constrain one table or another BEFORE that JOIN to give a much smaller record set, and if they have mathematical proof that the end result is the same, then they will optimize and go with the smaller set when possible. That's a (very) good thing.
Yes, very good indeed. I don't envy the people that write the algorithms to figure that stuff out. I do still remember my relational algebra but wouldn't know how to go about implementing that as efficiently as possible!
> In our application we wrote an abstraction layer with lazy loading. > (eg: If a User has a Profile the db users table has a profile_id and > we create a ProxyProfile that only has an id and will look up its > other attributes in the db if ever needed and then replace its > reference by a full Profile object.) Because of this, so far the > entire app only has 1 join because the other select(s) will only be > done if and when they're needed. I'm certain this is faster in the > average case but I wanted to know which is generally faster in case I > later profile the code and see that in some cases the dependent item > is pretty much always loaded. You really should write the code the most straight-forward way you can, and then optimize after identifying bottle-necks. Anything else is just optimization-masturbation.
We wrote this specifically to keep things conceptually simple. It's much easier to write an ORM layer and then use nothing but objects without worrying about the db in the application layer. The result of writing the ORM layer the way we did was that joins pretty much vanished (because of the lazy loading). I was just curious as to how good this actually was. It also makes it much easier to profile and make changes if necessary too.
> The db will be under heavy load (once we deploy) but we don't yet > intend on distributing the database. We did however plan for it since > in the scenario I described above we just need to create a different > db connection for a different table. We could theoretically have as > many different db servers as tables (except for that one join of 2 > tables). This is the scary part. You really ought to set up a QA server with simulated heavy load for real life testing, rather than waiting until you deploy to experience heavy load.
It's not as scary as it sounds. What I'm working on is pretty much a re-write of our current system (with many new features) and it runs off a single db server. We do expect the load to increase over time though (we currently get around 500 to 700 000 hits a day I think) so it's good to plan for growth.
-- Some people have a "gift" link here. Know what I want? I want you to buy a CD from some indie artist. http://cdbaby.com/browse/from/lynch Yeah, I get a buck. So?
-- PHP General Mailing List (http://www.php.net/) To unsubscribe, visit: http://www.php.net/unsub.php