On Sun, Dec 21, 2008 at 10:56 PM, Gregory Stark <stark@xxxxxxxxxxxxxxxx> wrote:
> Mark Wong <markwkm@xxxxxxxxx> writes:
>
>> On Dec 20, 2008, at 5:33 PM, Gregory Stark wrote:
>>
>>> "Mark Wong" <markwkm@xxxxxxxxx> writes:
>>>
>>>> To recap, dbt2 is a fair-use derivative of the TPC-C benchmark.  We
>>>> are using a 1000 warehouse database, which amounts to about 100GB of
>>>> raw text data.
>>>
>>> Really? Do you get conforming results with 1,000 warehouses? What's the 95th
>>> percentile response time?
>>
>> No, the results are not conforming.  You and others have pointed that out
>> already.  The 95th percentile response times are calculated on each page of
>> the previous links.
>
> Where exactly? Maybe I'm blind but I don't see them.

Here's an example:

http://207.173.203.223/~markwkm/community6/dbt2/baseline.1000.1/report/

The links on the blog entries should be pointing to their respective
reports.  I spot-checked a few and it seems I got some right.  I probably
didn't make it clear that you needed to click on the results to see the
reports.

>> I find your questions a little odd for the input I'm asking for.  Are you
>> under the impression we are trying to publish benchmarking results?
>> Perhaps this is a simple misunderstanding?
>
> Hm, perhaps. The "conventional" way to run TPC-C is to run it with larger and
> larger scale factors until you find out the largest scale factor you can get a
> conformant result at. In other words the scale factor is an output, not an
> input variable.
>
> You're using TPC-C just as an example workload and looking to see how to
> maximize the TPM for a given scale factor. I guess there's nothing wrong with
> that as long as everyone realizes it's not a TPC-C benchmark.

Perhaps, but we're not trying to run a TPC-C benchmark.  We're trying to
illustrate how performance changes with an understood OLTP workload.  The
purpose is to show how the system behaves, more so than what the maximum
transaction rate is.
We advertise the kit and the work as being for self-learning; we have
never tried to pass dbt-2 off as a benchmarking kit.

> Except that if the 95th percentile response times are well above a second I
> have to wonder whether the situation reflects an actual production OLTP system
> well. It implies there are so many concurrent sessions that any given query is
> being context switched out for seconds at a time.
>
> I have to imagine that a real production system would consider the system
> overloaded as soon as queries start taking significantly longer than they take
> on an unloaded system. People monitor the service wait times and queue depths
> for i/o systems closely and having several seconds of wait time is a highly
> abnormal situation.

We attempt to illustrate the response times in the reports.  For example,
there is a histogram (drawn as a scatter plot) illustrating the number of
transactions vs. the response time for each transaction.  This is for the
New Order transaction:

http://207.173.203.223/~markwkm/community6/dbt2/baseline.1000.1/report/dist_n.png

We also plot the response time for a transaction vs. the elapsed time
(also as a scatter plot).  Again, this is for the New Order transaction:

http://207.173.203.223/~markwkm/community6/dbt2/baseline.1000.1/report/rt_n.png

> I'm not sure how bad that is for the benchmarks. The only effect that comes to
> mind is that it might exaggerate the effects of some i/o intensive operations
> that under normal conditions might not cause any noticeable impact like wal
> log file switches or even checkpoints.

I'm not sure I'm following.  Is this something that can be shown by any
stats collection or profiling?  This vaguely reminds me of the
significant spikes in system time (and dips everywhere else) during
operating system fsyncs at checkpoint time, which we've always observed
when running this in the past.
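As an aside, for anyone wanting to reproduce the per-transaction metrics
discussed above outside the dbt-2 reports, the 95th percentile and the
response-time histogram are both straightforward to compute from a list of
per-transaction response times.  This is a minimal sketch, not dbt-2's
actual reporting code; the sample data and fixed-width binning are
assumptions for illustration:

```python
# Sketch: 95th-percentile (nearest-rank) and a fixed-width histogram
# over per-transaction response times, in seconds.  Hypothetical data;
# dbt-2's real reports are generated from its own mix logs.

def percentile(samples, pct):
    """Nearest-rank percentile over a list of response times."""
    ordered = sorted(samples)
    rank = max(0, int(round(pct / 100.0 * len(ordered))) - 1)
    return ordered[rank]

def histogram(samples, bin_width=0.1):
    """Count transactions per response-time bin of bin_width seconds."""
    bins = {}
    for rt in samples:
        bucket = int(rt / bin_width)
        bins[bucket] = bins.get(bucket, 0) + 1
    return bins

# Example response times for one transaction type (made up):
response_times = [0.12, 0.35, 0.08, 1.40, 0.22, 0.95, 0.18, 2.10, 0.30, 0.11]
p95 = percentile(response_times, 95)   # worst-case tail latency of the run
counts = histogram(response_times)     # data behind a dist_*.png-style plot
```

Plotting `counts` (bin vs. count) gives the same shape as the dist_n.png
histogram, and plotting response time against transaction start time gives
the rt_n.png-style scatter.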
> If you have a good i/o controller it might confuse your results a bit when
> you're comparing random and sequential i/o because the controller might be
> able to sort requests by physical position better than in a typical oltp
> environment where the wait queues are too short to effectively do that.

Thanks for the input.

Regards,
Mark

--
Sent via pgsql-performance mailing list (pgsql-performance@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance