Tom Lane wrote:
> Erik Jones <erik@xxxxxxxxxx> writes:
>> Before migrating to 8.2, even during peak times, unless queries were
>> seriously stacking (not completing in a timely manner), we'd see at most
>> 50 - 100 queries active at any given time (we did have
>> stats_command_string = on). Since the migration, during peak times it's
>> not uncommon to see the query count vary radically between around 80 and
>> the upper 400s (we have max_connections set to 512). This variation is
>> per second, and when the count is high, the vast majority listed are
>> sub-second. It would seem that this is due to some fine tuning
>> somewhere on the back side of 8.2 versus previous versions. Was there
>> previously a more limited precision to the query lengths that would be
>> picked up by the pg_stat_activity view?
> Hmm ... 8.2 has a completely rewritten pg_stat_activity implementation,
> but I'm not sure why it would show more active queries. The old code
> essentially showed you snapshots taken every half second, whereas the
> new code gives you The Truth as of the start of your transaction.
> So it should be more up-to-date, but I'm not seeing why the old method
> would have capped the number of active entries to less than what you see
> now. Maybe there was some weird scheduling interaction before? (As in,
> the stats collector was more likely to be able to run and push out its
> snapshot when there were fewer active backends?) Or maybe the stats
> subsystem is just reflecting reality, and some other change(s) elsewhere
> in 8.2 allow it to achieve higher concurrency than you got before.
> I dunno, but it would be worth looking closer to try to figure out
> what the story is. Do you have numbers from before from other
> monitoring tools such as vmstat, that you could compare to 8.2?
Alas, no, we don't. From a purely investigative standpoint, I wish we
did, as understanding these things is never less than worthwhile.
Luckily, from our production standpoint, we already knew that we weren't
seeing the "whole picture" when we queried pg_stat_activity: when we
clocked the number of transactions we do in an hour at over three
million (three months ago), we realized that the vast majority of
queries run through the system weren't registering, we assumed due to
them completing virtually instantaneously. Back then, the only time the
"active query count" would come anywhere near our max_connections
setting was when queries were stacking, i.e. taking forever to complete.
I must say that we had an initial freak-out moment when we saw those
numbers climbing and jumping around, until we saw that the vast majority
were running in sub-second time.
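For anyone curious, the sampling we do amounts to something along these
lines (our actual monitoring script is more involved; the '<IDLE>'
filter matches how 8.2's pg_stat_activity marks idle backends):

```sql
-- Count backends that are actually running a query right now;
-- in 8.2, idle backends show current_query = '<IDLE>'.
SELECT count(*) AS active_queries
  FROM pg_stat_activity
 WHERE current_query <> '<IDLE>';
```

Run once a second, that count is what jumps between ~80 and the upper
400s during peak times.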
One question regarding my previous thread about the 8.2 client tools.
We have yet to find time (due to personal as well as usage-pattern
constraints) to dump our schema, analyze it for any possible
discrepancies, and clock schema queries. Is there any reason we
couldn't use the 8.1 pg_dump facility until such time as we can figure
out our issues with the 8.2 pg_dump client (and psql, etc.)?
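To be concrete, I mean simply pointing the old binary at the 8.2 server
from a side-by-side install, something like the following (paths, host,
and database name here are hypothetical placeholders for our setup):

```shell
# Hypothetical paths: invoke the 8.1 pg_dump binary directly against
# the 8.2 server to produce a schema-only dump.
/usr/local/pgsql-8.1/bin/pg_dump --schema-only \
    --host=dbhost --port=5432 --username=postgres \
    ourdb > schema.sql
```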
--
erik jones <erik@xxxxxxxxxx>
software development
emma(r)