Re: Delay in completion of aggregation inserts when run in a single commit - PG 9.1.2

On 07/05/2012 02:30 AM, Akash Kodibail wrote:
-Staging table population happens in expected time. Anywhere from 10-15 minutes for each process.

-Aggregation process almost never completes which is a set of 15 insert queries. Entire process happens in a single commit (population of staging tables and aggregation).

When you say "never completes" - can you be more specific?

Is the query not progressing - say, waiting on a lock?

Is it doing work, but performing very poorly?

Does it terminate with a failure? If so, what is the error message?

If the query just keeps running but you're not sure what it is or isn't doing, you should examine pg_stat_activity and pg_locks. From pg_stat_activity you can get the backend process ID; examine that backend using system tools (ps, top, etc) to see if it's using CPU and whether it's doing disk I/O. From pg_locks you can get an idea of whether a query is stuck waiting on a lock.
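As a rough sketch of that check (column names per 9.1, where pg_stat_activity exposes procpid, current_query and waiting; 9.2 renames these), something along these lines shows what each backend is doing and which lock requests are stuck ungranted:

```sql
-- What is each backend doing? (PostgreSQL 9.1 column names)
SELECT procpid, waiting, xact_start, current_query
FROM pg_stat_activity
ORDER BY xact_start;

-- Which lock requests have not been granted?
SELECT l.pid, l.locktype, l.relation::regclass AS relation, l.mode
FROM pg_locks l
WHERE NOT l.granted;
```

With the procpid in hand you can then watch the backend from the shell, e.g. `ps -o pid,stat,%cpu -p <procpid>` or `top`, to see whether it is burning CPU or doing disk I/O.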

Could you be kind enough to explain how differently queries run when in a “single commit” as opposed to running each of them individually?

There shouldn't be a huge difference. The earlier queries continue to hold their locks while the later ones run, but since you're in the same transaction that won't affect your subsequent queries, only concurrent ones from other sessions.

If you have other concurrent work going on it's possible you're running into locking issues that you don't encounter when you release locks with individual commits.

Also, some resources can only be released when a transaction commits. If you're doing something like using huge numbers of savepoints or PL/pgSQL BEGIN ... EXCEPTION blocks, it's possible you're running into resource issues. The SAVEPOINT code has improved massively since 8.3, when I had similar issues, but savepoints still aren't free.
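To illustrate the savepoint cost: each PL/pgSQL BEGIN ... EXCEPTION block implicitly takes a savepoint (a subtransaction), so a pattern like the hypothetical one below creates one per iteration, all inside the single enclosing transaction. (staging_demo is a made-up table name for the sketch.)

```sql
DO $$
BEGIN
  FOR i IN 1..100000 LOOP
    BEGIN
      -- Each iteration of this block starts an implicit savepoint.
      INSERT INTO staging_demo (id) VALUES (i);
    EXCEPTION WHEN unique_violation THEN
      NULL;  -- swallow duplicates; rolls back to the implicit savepoint
    END;
  END LOOP;
END;
$$;
```

Run in one big transaction, that is 100,000 subtransactions' worth of overhead that only goes away at commit.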

Start by looking at the locking situation, and by determining whether your query is making slow forward progress or is simply waiting.

Postgresql.conf remains untouched, except for log writing which has been made to “all”.

That's going to be VERY sub-optimal if you're bulk-loading data. You'll need to push checkpoint_segments right up. Check your logs; you're probably seeing warnings about checkpoints being too frequent. Hell, my checkpoint_segments is at 50 on my *laptop* - albeit a very fast SSD-equipped laptop that does lots of bulk data unit testing.

You'll want to push effective_cache_size up to 4 or 5 GB given you have three instances. Hard to say regarding shared_buffers; I have little experience with running multiple instances on one machine.
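As a sketch of what that tuning might look like in postgresql.conf on 9.1 - the values here are illustrative starting points, not measurements for your workload:

```
# Illustrative postgresql.conf fragment for bulk loading (PG 9.1)
checkpoint_segments = 50           # fewer, larger checkpoints during bulk loads
checkpoint_completion_target = 0.8 # spread checkpoint I/O out over time
effective_cache_size = 4GB         # planner hint only, not an allocation;
                                   # size it per instance when running several
```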

--
Craig Ringer

--
Sent via pgsql-admin mailing list (pgsql-admin@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-admin

