Re: dbt-2 tuning results with postgresql-8.3.5

Gregory Stark <stark@xxxxxxxxxxxxxxxx> · Mon, 22 Dec 2008 06:56:24 +0000

Mark Wong <markwkm@xxxxxxxxx> writes:

> On Dec 20, 2008, at 5:33 PM, Gregory Stark wrote:
>
>> "Mark Wong" <markwkm@xxxxxxxxx> writes:
>>
>>> To recap, dbt2 is a fair-use derivative of the TPC-C benchmark.  We
>>> are using a 1000 warehouse database, which amounts to about 100GB of
>>> raw text data.
>>
>> Really? Do you get conforming results with 1,000 warehouses? What's the 95th
>> percentile response time?
>
> No, the results are not conforming.  You and others have pointed that out
> already.  The 95th percentile response times are calculated on each page of
> the previous links.

Where exactly? Maybe I'm blind but I don't see them.

>
> I find your questions a little odd for the input I'm asking for.  Are you
> under the impression we are trying to publish benchmarking results?  Perhaps
> this is a simple misunderstanding?

Hm, perhaps. The "conventional" way to run TPC-C is to run it with larger and
larger scale factors until you find the largest scale factor at which you can
still get a conformant result. In other words the scale factor is an output,
not an input variable.
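
To make the "scale factor is an output" idea concrete, here is a minimal
sketch of that search. It assumes a hypothetical predicate is_conformant(n)
that runs the workload at n warehouses and reports whether the run met the
conformance rules, and it assumes conformance is monotone (passes up to some
warehouse count, fails beyond it). Neither the predicate nor the function name
comes from dbt2 or the TPC-C kit.

```python
from typing import Callable


def largest_conformant_scale(
    is_conformant: Callable[[int], bool], limit: int = 1_000_000
) -> int:
    """Return the largest warehouse count in [1, limit] that is conformant.

    Assumes is_conformant is True up to some threshold and False after it.
    Returns 0 if even a single warehouse fails.
    """
    if not is_conformant(1):
        return 0

    # Phase 1: double the scale factor until a run fails or we pass the limit.
    lo, hi = 1, 2
    while hi <= limit and is_conformant(hi):
        lo = hi
        hi *= 2
    hi = min(hi, limit + 1)  # treat anything beyond the limit as failing

    # Phase 2: binary search between the last pass (lo) and first fail (hi).
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_conformant(mid):
            lo = mid
        else:
            hi = mid
    return lo
```

In practice each call to is_conformant would be a full benchmark run, so the
doubling-then-bisecting order matters: it keeps the number of runs
logarithmic in the final scale factor.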

You're using TPC-C just as an example workload and looking to see how to
maximize the TPM for a given scale factor. I guess there's nothing wrong with
that as long as everyone realizes it's not a TPC-C benchmark.

Except that if the 95th percentile response times are well above a second I
have to wonder whether the situation reflects an actual production OLTP system
well. It implies there are so many concurrent sessions that any given query is
being context switched out for seconds at a time.
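
For anyone wanting to check this figure from raw per-transaction timings, a
minimal sketch of a nearest-rank percentile follows. The function name and
the sample values are made up for illustration; this is not how dbt2 itself
computes its report.

```python
from typing import Sequence


def percentile_nearest_rank(samples: Sequence[float], pct: int) -> float:
    """Return the pct-th percentile (1..100) using the nearest-rank method."""
    if not samples:
        raise ValueError("need at least one sample")
    if not 1 <= pct <= 100:
        raise ValueError("pct must be an integer between 1 and 100")
    ordered = sorted(samples)
    # Integer ceiling of pct * n / 100, avoiding floating-point rounding.
    rank = (pct * len(ordered) + 99) // 100
    return ordered[rank - 1]


# Hypothetical response times in seconds: 19 fast transactions, one slow one.
response_times = [0.2] * 19 + [4.0]
p95 = percentile_nearest_rank(response_times, 95)
```

Note that with only 20 samples a single slow transaction does not move the
95th percentile at all, so a figure well above a second means a substantial
share of transactions were slow, not just a few outliers.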

I have to imagine that a real production system would consider the system
overloaded as soon as queries start taking significantly longer than they take
on an unloaded system. People monitor the service wait times and queue depths
for i/o systems closely and having several seconds of wait time is a highly
abnormal situation.

I'm not sure how bad that is for the benchmarks. The only effect that comes to
mind is that it might exaggerate the effects of some i/o intensive operations,
like WAL file switches or even checkpoints, that under normal conditions might
not cause any noticeable impact.

If you have a good i/o controller it might confuse your results a bit when
you're comparing random and sequential i/o because the controller might be
able to sort requests by physical position better than in a typical OLTP
environment where the wait queues are too short to effectively do that.
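
A toy model of that effect, as a rough sketch only: requests arrive at random
block positions, and the "controller" takes them in batches of queue_depth,
serving each batch in ascending position order. A depth of 1 is plain
first-come-first-served. The block range, request count and batching scheme
are all assumptions chosen for illustration and do not describe any real
controller.

```python
import random
from typing import List


def total_seek_distance(positions: List[int], queue_depth: int) -> int:
    """Sum of head movement when requests are served in sorted batches."""
    head = 0
    travelled = 0
    for start in range(0, len(positions), queue_depth):
        # The controller only sees the requests currently queued.
        batch = sorted(positions[start:start + queue_depth])
        for pos in batch:
            travelled += abs(pos - head)
            head = pos
    return travelled


rng = random.Random(0)  # fixed seed so the comparison is repeatable
requests = [rng.randrange(1_000_000) for _ in range(10_000)]

shallow = total_seek_distance(requests, queue_depth=1)   # short wait queue
deep = total_seek_distance(requests, queue_depth=32)     # overloaded system
```

In this model the deep queue moves the head far less per request than the
shallow one, which is the sense in which an overloaded benchmark could make
random i/o look cheaper than it would be on a lightly queued system.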

-- 
  Gregory Stark
  EnterpriseDB          http://www.enterprisedb.com
  Ask me about EnterpriseDB's Slony Replication support!
