Hi, Have a major reliability problem with out postgres installation here. Basically the backends seem to "hang" when the server is under non-trivial load. It can cope with ~200 concurrent connections (max is 1024) but anything over that and all the backends start to hang. By hang I mean they answer connection attempts and accept a SQL statement but take a LONG time to reply. During this time if I do a "df" on the server the df command will block as well when it comes to list the partition that postgres's database is stored on. The box does not appear to be CPU bound and there is little or no ioswap activity. There is however a LOT of disk writing activity on the postgres partition. Vitals: * Solaris 10 (SPARC) * Postgres 8.0.2 Solaris 10 has a tool called dtrace which reveals that pid #687 (a postgres backend) is doing a LOT of writing to a single file (iosnoop.d for you dtrace hackers): UID PID D BLOCK SIZE COMM PATHNAME 103 687 W 120384 98304 postgres /export/data/pgsql/data/global/pgstat.tmp.687 103 687 W 120384 98304 postgres /export/data/pgsql/data/global/pgstat.tmp.687 103 687 W 120384 98304 postgres /export/data/pgsql/data/global/pgstat.tmp.687 103 687 W 4521728 65536 postgres /export/data/pgsql/data/global/pgstat.tmp.687 103 687 W 4521728 65536 postgres /export/data/pgsql/data/global/pgstat.tmp.687 103 687 W 4521728 65536 postgres /export/data/pgsql/data/global/pgstat.tmp.687 103 687 W 4521904 131072 postgres /export/data/pgsql/data/global/pgstat.tmp.687 103 687 W 4522160 81920 postgres /export/data/pgsql/data/global/pgstat.tmp.687 103 687 W 4521904 131072 postgres /export/data/pgsql/data/global/pgstat.tmp.687 103 687 W 4522160 81920 postgres /export/data/pgsql/data/global/pgstat.tmp.687 103 687 W 4521904 212992 postgres /export/data/pgsql/data/global/pgstat.tmp.687 [...] # ps -fp 687 UID PID PPID C STIME TTY TIME CMD pgsql 687 686 1 Dec 20 ? 57:09 /usr/local/pgsql/bin/postmaster -D /export/data/pgsql/data (Hmmm... Interesting. The PID of the process doing all that writing is always "+1" of its parent PID. I take it that the first child spawned by postmaster is responsible for updating the pg_stats_activity table and others?) This database serves a website and most of its queries (almost all) will be read-only. So it's not clear why so much disk writing is going on. So we started to look at exactly what was doing the writing and that's when we found this file. We have postgres running on two other boxes, both Solaris, same version of Postgres, and they show similar activity. We think that maybe Postgres is saturating the disk and this is causing the "blocking". Mind you the iostat.d program that produced the output above was running at a time when there was low load and things were fine. So my questions are: 1. What is the pgstat.tmp file for and why does a single PID on each server write almost continuously to it? 2. Why does that file not really exist in the file system according to "ls"? 3. Where should I look to determine why Postgres is writing so much to that disk? Dan -- Daniel Austin (+61 2 9358 7983) Internet Technology Manager -- Confidential: This email is intended only for the person or entity to which it is addressed. The information contained in this email may be confidential and privileged. Any retransmission, review, dissemination or other use of, or taking of any action in reliance upon, this information by persons or entities other than the intended recipient is prohibited. If you received this email in error, please let us know by reply and destroy all copies of it. Thank you.