On Fri, Feb 20, 2009 at 4:34 PM, Jonah H. Harris <jonah.harris@xxxxxxxxx> wrote:
> On Fri, Feb 20, 2009 at 3:40 PM, Merlin Moncure <mmoncure@xxxxxxxxx> wrote:
>>
>> ISTM you are the one throwing out unsubstantiated assertions without
>> data to back it up.  The OP ran a benchmark, showed hardware/configs, and
>> demonstrated results.  He was careful to hedge expectations and gave
>> a rationale for his analysis methods.
>
> As I pointed out in my last email, he makes claims about PG being faster
> than Oracle and MySQL based on his results.  I've already pointed out
> significant tuning considerations, for both Postgres and Oracle, which his
> benchmark did not take into account.
>
> This group really surprises me sometimes.  For such a smart group of people,
> I'm not sure why everyone seems to have a problem pointing out design flaws,
> etc. in -hackers, yet when we want to look good, we'll overlook blatant
> flaws where benchmarks are concerned.

The biggest flaw in the benchmark by far has got to be that it was done
with a ramdisk, so it's really only measuring CPU consumption.
Measuring CPU consumption is interesting, but it doesn't have a lot to
do with throughput in real-life situations.  The benchmark was obviously
constructed to make PG look good, since the OP even mentions on the page
that the reason he went to a ramdisk was that all of the databases, *but
particularly PG*, had trouble handling all those little writes.  (I
wonder how much it would help to fiddle with the synchronous_commit
settings.  How do MySQL and Oracle alleviate this problem, and can we
usefully imitate any of it?)

Still, if you read his conclusions, he admits that he's just trying to
show that they're in the same ballpark, and that might well be true,
even with the shortcomings of the tests.  Personally, I'm not as upset
as you seem to be about the lack of perfect tuning.  Real-world tuning
is rarely perfect, either, and we don't know that his tuning was bad.
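(For anyone who wants to try the synchronous_commit experiment: the GUC
itself is standard PostgreSQL, though applying it to this benchmark is
just a sketch on my part, not something the OP tested.)

```sql
-- Per session: each COMMIT returns without waiting for its WAL flush
-- to reach disk.  On a crash you can lose the last few transactions,
-- but there is no risk of corruption, and small writes get much cheaper.
SET synchronous_commit = off;

-- Or cluster-wide, in postgresql.conf:
--   synchronous_commit = off
```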
We do know that whatever tuning he did was not adequately documented,
and we can suspect that the items mentioned were not tuned, but we
really don't know that.  We have plenty of evidence from these lists
that fiddling with shared_buffers (raising it, or sometimes even
lowering it), page and tuple costs, etc. can sometimes produce dramatic
performance changes.  But that doesn't necessarily tell you anything
about what will happen in a real-life application with a more complex
mix of queries, where you can't optimize for the benchmark.

>> If you think he's wrong, instead of picking on him why don't you run
>> some tests showing alternative results and publish them... leave off
>> the Oracle results or use a pseudonym or something.
>
> One of these days I'll get some time and post my results.  I'm just pointing
> out obvious flaws in this benchmark.  If Sergio wants to correct them and/or
> qualify them, that's cool with me.  I just don't like people relying on
> questionable and/or unclear data.

I'd love to see more results.  Even if they're not 100% complete and
correct, they would give us more of a notion than we have now of where
more work is needed.  I was interested to see that Oracle was the
runaway winner for bulk data load, because I did some work on that a
few months back.  I suspect a lot more is needed there, because the
work I did would only help with create-table-as-select or COPY, not
retail INSERT, and even at that I know that the cases I did handle have
room for further improvement.

I am not certain which database is the fastest, and I suspect there is
no one answer.  But if we get some information that helps us figure out
where we can improve, that is all to the good.

...Robert

--
Sent via pgsql-performance mailing list (pgsql-performance@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance