With this approach, I will be assuming that query time does not vary with client location, which, though reasonable, is still an assumption. It would have been better to test without making this (or any) assumption.
But it looks like there is no choice, as measuring the query time for queries fired by the clients is not possible.
I would still fire concurrent clients from the different locations, but measure the 'psql timing' only for the queries fired on the database server itself. I will then extrapolate the outlier percentage seen on the database server (say, queries that take more than 200 ms because of checkpoint flushing, etc.) to estimate the total number of outliers.
This is good enough for the time being, and I will try it. If you can think of alternatives where I don't have to assume or extrapolate, please let me know.
Do you think changing log_destination to syslog would make a difference? (Kevin mentioned that even this timing is not totally immune from network effects, but that, if it can be measured, it should be very close to the query time.)
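
A minimal sketch of the extrapolation described above, in Python. It assumes the server-side psql output was captured to text and that each statement's time appears on a line of the form "Time: 12.345 ms". The function names, the sample text, and the total query count are all hypothetical placeholders; only the 200 ms threshold comes from the message above.

```python
import re

# Matches lines such as "Time: 12.345 ms" (assumed format of the captured
# psql timing output; adjust the pattern if your output differs).
TIME_LINE = re.compile(r"^Time:\s+(\d+(?:\.\d+)?)\s+ms")


def parse_timings(psql_output):
    """Return the per-statement times (in ms) found in captured psql output."""
    times = []
    for line in psql_output.splitlines():
        match = TIME_LINE.match(line.strip())
        if match:
            times.append(float(match.group(1)))
    return times


def outlier_fraction(times_ms, threshold_ms=200.0):
    """Fraction of statements slower than threshold_ms (0.0 if no samples)."""
    if not times_ms:
        return 0.0
    slow = sum(1 for t in times_ms if t > threshold_ms)
    return slow / len(times_ms)


def extrapolate_outliers(fraction, total_queries):
    """Estimate the outlier count across all clients, assuming the
    server-side outlier fraction also holds for the remote clients."""
    return round(fraction * total_queries)


# Placeholder text standing in for captured output, not real measurements.
sample = """INSERT 0 1
Time: 3.210 ms
UPDATE 1
Time: 250.004 ms
(1 row)
Time: 1.500 ms
Time: 812.900 ms
"""

times = parse_timings(sample)
fraction = outlier_fraction(times)
print(f"{len(times)} statements, fraction over 200 ms: {fraction:.2f}")
print("estimated outliers in 1000 queries:", extrapolate_outliers(fraction, 1000))
```

The extrapolation step is only as good as the assumption that the server-side outlier fraction carries over to the remote clients, which is exactly the assumption discussed above.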
From: Kevin Grittner <Kevin.Grittner@xxxxxxxxxxxx>
To: Scott Marlowe <scott.marlowe@xxxxxxxxx>; A J <s5aly@xxxxxxxxx>
Cc: pgsql-admin@xxxxxxxxxxxxxx
Sent: Thu, September 2, 2010 2:31:24 PM
Subject: Re: Confused by 'timing' results
Scott Marlowe <scott.marlowe@xxxxxxxxx> wrote:
> On Thu, Sep 2, 2010 at 11:34 AM, A J <s5aly@xxxxxxxxx> wrote:
>> The problem I am trying to solve is:
>> measure accurately both the database server time + network time
>> when several clients connect to the database from different
>> geographic location. All the clients hit the database
>> simultaneously with a long script each of insert/update/select
>> queries.
>
> Then that's what you should test. create long scripts, run them
> from different locales, and measure the overall time differences,
> if any, of the same file from different locales.
I'm inclined to agree with Scott. The effects of the network come
into play in several different ways, and I can't think of a better
way to isolate those effects from the query run time itself than to
run exactly the same queries on the server itself and from the
various remote locations. Subtract the server-based time from each
location's time to find the impact of the network. Doesn't that
address your problem fairly directly and accurately?
-Kevin
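
The subtraction Kevin suggests can be sketched in a few lines of Python. This assumes the same script has been timed once on the server and once from each remote location. The location names and the numbers are made-up placeholders, not measurements, and network_overhead is a hypothetical helper name.

```python
def network_overhead(server_seconds, location_seconds):
    """Subtract the server-based run time from each location's run time.

    server_seconds: total time of the script when run on the server itself.
    location_seconds: dict mapping a location name to that location's total time.
    Returns a dict mapping each location to the time attributed to the network.
    """
    return {
        location: elapsed - server_seconds
        for location, elapsed in location_seconds.items()
    }


# Placeholder values only; replace with the measured totals.
server_time = 40.0
remote_times = {"location_a": 42.5, "location_b": 55.0}

for location, overhead in network_overhead(server_time, remote_times).items():
    print(f"{location}: {overhead:.1f} s attributed to the network")
```

Running each script several times and comparing medians rather than single runs would make the difference less sensitive to one-off events such as a checkpoint.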