Re: Performance question for table updating (SOLVED)

"Jon Westcot" <jon@xxxxxxxxxxx> · Sat, 24 Nov 2007 04:03:53 -0700

Hi Rob, et al.:

----- Original Message -----
From: "Andrés Robinet" <agrobinet@xxxxxxxxxxxxx>
> > -----Original Message-----
> > From: Jon Westcot [mailto:jon@xxxxxxxxxxx]
> >
> > :: gigantic snip here::
> >
> >     So, long story short (oops -- too late!), what's the concensus
> > among the learned assembly here?  Is it faster to just UPDATE the
> > record if it already exists regardless of the fact that maybe only one
> > or two out of 75 or more fields changed versus testing each one of
> > those 75 fields to try and figure out which ones actually changed and
> > then only update those?
> >
> >     I look forward to reading all of your thoughts.
> >
> >     Sincerely,
> >
> >         Jon
>
> I don't know about consensus over here because I'm kind of newgie (stands
> for new geek, as opposed to newbie which stands for new ball breaker :D :D
> ). I don't know of your previous messages but I can tell you one story...
> Some time ago I got involved in a project that required geo-distance
> calculation (you know distance between two points with latitude and
> longitude). Basically I had to take a set of points and calculate the
> distance of each of those points to a given (reference) one. The math was
> something like the "square root of the sum of a constant times the square
> sin of..." well, I can't remember it, but the point is, it was a
complicated
> formula, which I thought it would allow for some optimizations in PHP.
> Accustomed to regular (compiled) programming languages I developed a set
of
> routines to optimize the task and went ahead and queried the database for
> the (say, 1000 records) dataset of points. Then applied the math to the
> points and the reference point and got the result... in about 5 minutes to
> my (disgusting) surprise.
> Then I grabbed the MySQL manual, built a "non-optimized" version of the
> formula to put directly in the SQL query and get the "shortest distance"
> (which was my goal in the end) calculated by MySQL right away. I thought
> "ok, I'll prepare a cup of coffee to wait for MySQL to finish the
> calculation". To my surprise the query returned the expected result in
less
> than 2 seconds.
> My logic was (wrongly) the following: PHP is a programming language, SQL
is
> a data access language; I'll get the data using MySQL and do the math
using
> PHP. But I forgot PHP is an interpreted language, that a number is more
than
> a number to PHP, but a ZVAL_<whatever> object behind the scenes. I forgot
> about the memory and the time required to build those objects when one
> retrieves data out of a database server. I forgot about parsing time, and
> "support logic and safety checks" in the language that overkill any
attempt
> to build TDCPL (Too Damn Complex Programming Logic) in PHP.
> So, now, when I have to do some logic stuff to the retrieved data, I first
> check "how much" I can push into the query itself, to get little or
nothing
> of programming logic in PHP after retrieving (before storing) the data.
> All that said, I'd give a shot to the MySQL REPLACE function (I wouldn't
> even branch the code to use INSERT or UPDATE depending on the record
already
> existing or not, If you have a primary key, all you need is REPLACE). But,
> PLEASE LOOK AT THE GOTCHAS (like set col_name=col_name+1). Furthermore, If
> those data files were to be uploaded by me (I mean, me, the coder, not the
> end user), I'd build (use) a program to convert them to SQL sentences in
my
> desktop PC where I can use faster programming languages and I can wait for
> five minutes of heavy processing (instead of overkilling the server for
five
> minutes which will slow down every other service in there).
> In the end it depends on your requirements and where you get the data from
> and if and how you want to automate the task (I didn't get your previous
> messages, I got subscribed recently, if you can send me a link to those
> ones... great!)
>
> Rob

    Thanks for the comments and suggestions.  Prior to receiving your note,
I went back and did a bit of checking on my code.  Turns out that the
problem was "hardware" related -- the infamous "Loose Screw Behind the
Keyboard."

    The problem actually boiled down to two quotation marks -- they were
present in the search code to see if a record with the specified key
existed, but were omitted in the WHERE clause of the UPDATE statement.  Said
update therefore refused to use the nice little index I'd provided for its
use and instead scanned through the entire table to find the record in
question.

    Moral of the story?  Two, really.  First, ensure you always reference
values in the way most appropriate for their type.  Second, don't make your
idiocy public by asking stupid questions on a public forum. <g>  What's the
quote (probably attributed to Churchill)?  "It is better to be ignorant and
silent than to voice one's opinions and remove all doubt." ;)

    Jon

-- 
PHP General Mailing List (http://www.php.net/)
To unsubscribe, visit: http://www.php.net/unsub.php