On Wed, June 13, 2007 12:31 pm, Guillaume Theoret wrote: > Thanks for the response. > > On 6/13/07, Robert Cummings <robert@xxxxxxxxxxxxx> wrote: >> On Wed, 2007-06-13 at 10:36 -0400, Guillaume Theoret wrote: >> > Hi everyone, >> > >> > Can anyone recommend any good books/resources for php and mysql >> > performance management? I'm more interested in the software >> > development side (as opposed to server configuration) of things. >> I'm >> > looking for anything that would be good to know when working on >> > websites that get millions of hits a day. >> > >> > Also, if anyone knows of any resources/discussions that illustrate >> the >> > relative performance of joins vs multiple selects I'd love to >> check it >> > out. >> >> JOIN will almost always be faster by virtue of the query being >> optimized >> and doing the work within a single request. > > Really? I thought the way it worked was that when you joined 2 tables > it needed to create every row combination applicable and then apply > the where clause. In large tables wouldn't this be slower? It's these > kinds of optimizations and when the kick in, etc that I don't know > much about. Conceptually, JOIN builds that monster table. If the DB engine can figure out how to constrain one table or another BEFORE that JOIN to give a much smaller record set, and if they have mathematical proof that the end result is the same, then they will optimize and go with the smaller set when possible. That's a (very) good thing. > In our application we wrote an abstraction layer with lazy loading. > (eg: If a User has a Profile the db users table has a profile_id and > we create a ProxyProfile that only has an id and will look up its > other attributes in the db if ever needed and then replace its > reference by a full Profile object.) Because of this, so far the > entire app only has 1 join because the other select(s) will only be > done if and when they're needed. I'm certain this is faster in the > average case but I wanted to know which is generally faster in case I > later profile the code and see that in some cases the dependent item > is pretty much always loaded. You really should write the code the most straight-forward way you can, and then optimize after identifying bottle-necks. Anything else is just optimization-masturbation. > The db will be under heavy load (once we deploy) but we don't yet > intend on distributing the database. We did however plan for it since > in the scenario I described above we just need to create a different > db connection for a different table. We could theoretically have as > many different db servers as tables (except for that one join of 2 > tables). This is the scary part. You really ought to set up a QA server with simulated heavy load for real life testing, rather than waiting until you deploy to experience heavy load. -- Some people have a "gift" link here. Know what I want? I want you to buy a CD from some indie artist. http://cdbaby.com/browse/from/lynch Yeah, I get a buck. So? -- PHP General Mailing List (http://www.php.net/) To unsubscribe, visit: http://www.php.net/unsub.php