Re: Postmaster Out of Memory

Jeff Gold <jgold@xxxxxxxxxxxxxxxx> · Fri, 24 Jun 2005 11:06:06 -0400

Tom Lane wrote:
I can absolutely, positively say that that dump is not from the parent
postmaster.  It's a backend.

That makes sense.  I'm still a bit puzzled about why new clients can't 
connect when the problem happens, though.  Does the parent postmaster 
need some resource from one of the backends in order to establish a new 
connection?  Perhaps this will allow us to understand which one is 
producing these dumps.

I can't really think what would blow out CacheMemoryContext like that.

I have some new information here.  In at least one case it turns out the 
statements that provoke the out of memory error are these:

  TRUNCATE TABLE events.current_hostpair1
  Query failed with message ERROR:  out of memory
  DETAIL:  Failed on request of size 268.

One of the things we did around the time we introduced the wrapper 
around pg_autovacuum was to change the way we process certain tables so 
that instead of dropping a table and re-creating it we would use 
TRUNCATE statements.  This is more efficient in our testing.

Could this practice be running one of the backends out of memory 
somehow?  Is there a less cache hungry alternative we could try?

This is quite uninformative.  Are you suggesting that there are many
many lines about indexes?  Many many lines about DynaHashTable?  Both?

I presented the start and the end of what seemed to my uninformed eye to 
be the relevant error messages, since posting all 46.7 megabytes seemed 
impolite.  :-)  According to grep there are 122034 lines that include 
the word "index" in any combination of case.  "DynaHashTable" appears a 
mere eleven times, all at the bottom.  There are 640582 lines in total.

In other words, I'm not sure which of these messages is important or 
what I should be looking for.  Any suggestions?

How many tables/indexes do you have in your database, anyway?

According to a quick-and-dirty script I hacked together we have 484 
tables and 141 indexes outside of the system resources created by 
PostgreSQL itself.  Here's the script:

  tables=0
  indexes=0
  for schema in $(echo '\dn' | psql mazu postgres | grep '|' | \
                  grep -v Name | sed 's/|.*//' | grep -v pg_ | \
                  grep -v information_schema); do
    tables=$(($tables + $(echo '\dt' $schema.\* | \
                          psql mazu postgres | grep rows | \
                          sed 's/.*(//;s/ rows).*//') ))
    indexes=$(($indexes + $(echo '\di' $schema.\* | \
                            psql mazu postgres | grep rows | \
                            sed 's/.*(//;s/ rows).*//') ))
  done
  echo tables = $tables
  echo indexes = $indexes

Could this process be running pg_dump, or something else that would try to
touch them all?

We only use pg_dump for manual backup procedures.  There is no evidence 
in our logs that anyone attempted that during the time that the problem 
occured.  The process that issues the TRUNCATE command works with only 
one of our schemas as far as I can tell.

Thanks for taking the time to respond.

                                                Jeff

---------------------------(end of broadcast)---------------------------
TIP 6: Have you searched our list archives?

              http://archives.postgresql.org