On 15/02/2008, Richard Huxton <dev@xxxxxxxxxxxx> wrote: > Phoenix Kiula wrote: > > On 15/02/2008, Richard Huxton <dev@xxxxxxxxxxxx> wrote: > > > >> Ah, more new information! This does seem to point to the load, > >> particularly if it's exactly the same query each time. So what do > >> top/vmstat etc show for these "go-slow" periods? > > > > In included top and vmstat info in my other post yesterday, but here > > it is again: > > > Ah, you had a post yesterday! > > (goes away, searches for previous post) > http://archives.postgresql.org/pgsql-general/2008-02/msg00689.php > PG quitting sporadically!! > > Right, OK. Firstly, stop worrying about index usage and/or bloat. You > have unexplained process crashes to deal with first. There's no point in > looking at indexes until you figure out what is killing your processes. > > Secondly, a single line from vmstat isn't useful, you want to compare > what is happening when things are fine with when they aren't. Leave > vmstat 10 logging to a file so you can catch it. > > Thirdly, have you upgraded to the latest 8.2 (8.2.6) yet? > > I see you've reduced work_mem, that's good. > > Oh, you might as well lower max_connections from 150 too, there's no way > you can support that many concurrent queries anyway. > > > The fact that you're seeing various strange socket-related problems is > odd. As is the fact that logging doesn't seem to work for you. > > Are you sure the two sets of vmstat/top figures are from when PG was > crashing/running queries slow? Everything seems idle to me in those figures. No. They are the vmstat figures from when I was replying to your email. What will vmstat tell me and how should I set it up to do "vmstat 10 logging"? Btw, postgresql logging is working. But here're the kind of things I have in there: LOG: test message did not get through on socket for statistics collector LOG: disabling statistics collector for lack of working socket LOG: database system was shut down at 2008-02-15 06:12:10 CST LOG: checkpoint record is at 8/E785304C LOG: redo record is at 8/E785304C; undo record is at 0/0; shutdown TRUE LOG: next transaction ID: 0/296892698; next OID: 97929 LOG: next MultiXactId: 1; next MultiXactOffset: 0 LOG: database system is ready LOG: could not receive data from client: Connection reset by peer LOG: unexpected EOF on client connection LOG: could not receive data from client: Connection reset by peer LOG: unexpected EOF on client connection Now I don't know what is wrong or even where I should look. Postgresql is often taking quite a bit of memory and CPU resources. I've reduced work_mem to 10MB and Max_connections to 100. (Anyway, the old values were working just fine until recently!) The biggest problem: when I su into postgres user and do a psql to get into the PG console in my SSH, it takes a whole lot of time to come up! It used to come up in a jiffy earlier!!! It now shows me this error: ~ > psql: could not connect to server: Connection timed out Is the server running on host "localhost" and accepting TCP/IP connections on port 5432? Then, five minutes later, I can connect again! In less than a second! What gives? Finally, very simple queries like this one: select url, disable_in_statistics, id, user_known from links where alias = '1yqw7' and status = 'Y' limit 1 Which used to be server in "5 ms" (0.005 seconds) are now taking upwards of 200 seconds! Your suggestion to "Explain Analyze" -- =# explain analyze select url, disable_in_statistics, id, user_known from links where alias = '1yqw7' and status = 'Y' limit 1 ; QUERY PLAN -------------------------------------------------------------------------------------------------------------------------------- Limit (cost=0.00..8.74 rows=1 width=113) (actual time=9.639..9.643 rows=1 loops=1) -> Index Scan using links2_alias_key on links (cost=0.00..8.74 rows=1 width=113) (actual time=9.630..9.630 rows=1 loops=1) Index Cond: ((alias)::text = '1yqw7'::text) Filter: (status = 'Y'::bpchar) Total runtime: 16.425 ms (5 rows) Now this is only when I have connected to the psql console, of course. Still, these queries are intermittently very slow! ---------------------------(end of broadcast)--------------------------- TIP 4: Have you searched our list archives? http://archives.postgresql.org/