Tom Lane wrote:
I can absolutely, positively say that that dump is not from the parent
postmaster. It's a backend.
That makes sense. I'm still a bit puzzled about why new clients can't
connect when the problem happens, though. Does the parent postmaster
need some resource from one of the backends in order to establish a new
connection? Perhaps this will allow us to understand which one is
producing these dumps.
I can't really think what would blow out CacheMemoryContext like that.
I have some new information here. In at least one case it turns out the
statements that provoke the out of memory error are these:
TRUNCATE TABLE events.current_hostpair1
Query failed with message ERROR: out of memory
DETAIL: Failed on request of size 268.
One of the things we did around the time we introduced the wrapper
around pg_autovacuum was to change the way we process certain tables so
that instead of dropping a table and re-creating it we would use
TRUNCATE statements. This is more efficient in our testing.
Could this practice be running one of the backends out of memory
somehow? Is there a less cache hungry alternative we could try?
This is quite uninformative. Are you suggesting that there are many
many lines about indexes? Many many lines about DynaHashTable? Both?
I presented the start and the end of what seemed to my uninformed eye to
be the relevant error messages, since posting all 46.7 megabytes seemed
impolite. :-) According to grep there are 122034 lines that include
the word "index" in any combination of case. "DynaHashTable" appears a
mere eleven times, all at the bottom. There are 640582 lines in total.
In other words, I'm not sure which of these messages is important or
what I should be looking for. Any suggestions?
How many tables/indexes do you have in your database, anyway?
According to a quick-and-dirty script I hacked together we have 484
tables and 141 indexes outside of the system resources created by
PostgreSQL itself. Here's the script:
tables=0
indexes=0
for schema in $(echo '\dn' | psql mazu postgres | grep '|' | \
grep -v Name | sed 's/|.*//' | grep -v pg_ | \
grep -v information_schema); do
tables=$(($tables + $(echo '\dt' $schema.\* | \
psql mazu postgres | grep rows | \
sed 's/.*(//;s/ rows).*//') ))
indexes=$(($indexes + $(echo '\di' $schema.\* | \
psql mazu postgres | grep rows | \
sed 's/.*(//;s/ rows).*//') ))
done
echo tables = $tables
echo indexes = $indexes
Could this process be running pg_dump, or something else that would try to
touch them all?
We only use pg_dump for manual backup procedures. There is no evidence
in our logs that anyone attempted that during the time that the problem
occured. The process that issues the TRUNCATE command works with only
one of our schemas as far as I can tell.
Thanks for taking the time to respond.
Jeff
---------------------------(end of broadcast)---------------------------
TIP 6: Have you searched our list archives?
http://archives.postgresql.org