Search Postgresql Archives

Re: PGError: server closed the connection unexpectedly

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



My biggest problem is the lack of any real error message on the server.
   I don't see anything wrong in the system logs, and there's no core
file in the /var/postgresql directory.


Are you sure core files are enabled; i.e. the server is running with
'ulimit -c unlimited' ?

Yes:

$ whoami
_postgresql
$ ulimit -c
unlimited


   I did a 'vacuumdb -afz' just as
a shot in the dark, without affect.  Pretty much all I see in the logs
is this:

<USER%DB x.y.z.a(51478) 487>LOG:  unexpected EOF on client connection


This means a client is dying or closing its connection prematurely,
and would seem to be a different problem.  It shouldn't ever cause the
server to crash.

If the server is crashing - I have absolutely no info to suggest that, except possibly this: at a later time I did a tcpdump of one of these sessions, for lack of better ideas, and saw this near the end:

<snip lots of tcpdump output>
16:32:19.523842 clam.int.geekisp.com.15245 >
morningsun.int.geekisp.com.postgresql: P 17389:18036(647) ack 9323 win
16384 <nop,nop,timestamp 133396667 3702263885> (DF)
16:32:19.524289 morningsun.int.geekisp.com.postgresql >
clam.int.geekisp.com.15245: P 9323:9343(20) ack 18036 win 17376
<nop,nop,timestamp 3702263885 133396667> (DF)
16:32:19.562544 clam.int.geekisp.com.15245 >
morningsun.int.geekisp.com.postgresql: P 18036:18041(5) ack 9343 win
16384 <nop,nop,timestamp 133396667 3702263885> (DF)
******16:32:19.566662 morningsun.int.geekisp.com.postgresql >
clam.int.geekisp.com.15245: F 9343:9343(0) ack 18041 win 17376
<nop,nop,timestamp 3702263885 133396667> (DF)
16:32:19.567043 clam.int.geekisp.com.15245 >
morningsun.int.geekisp.com.postgresql: . ack 9344 win 16384
<nop,nop,timestamp 133396667 3702263885> (DF)
16:32:19.649445 clam.int.geekisp.com.15245 >
morningsun.int.geekisp.com.postgresql: P 18041:18851(810) ack 9344 win
16384 <nop,nop,timestamp 133396667 3702263885> (DF)
16:32:19.649468 morningsun.int.geekisp.com.postgresql >
clam.int.geekisp.com.15245: R 545360451:545360451(0) win 0 (DF)
16:32:19.649652 clam.int.geekisp.com.15245 >
morningsun.int.geekisp.com.postgresql: F 18851:18851(0) ack 9344 win
16384 <nop,nop,timestamp 133396667 3702263885> (DF)
16:32:19.649665 morningsun.int.geekisp.com.postgresql >
clam.int.geekisp.com.15245: R 545360451:545360451(0) win 0 (DF)

The '*****' above is my own, to highlight the interesting part. Morningsun is the server, clam is the ruby client. Based on this, I
think the server *might* be dying first (hard to tell), since morningsun
is the one who sends the FIN packet and thereby closes the connection. (Sorry for the horrible wrapping).

Again, having trouble making sure.

Googling turned up a few reports suggesting bad hardware, or corrupted
indexes, but I don't think that's the case here.

Any starting points or ideas would be greatly appreciated.


Make sure the server is able to dump a core file, and perhaps crank up
the logging level.

Just to verify that a core dump is possible for the postgresql user, I wrote a tiny C program that just calls abort(). Sure enough, running it gives me a core file.

I cranked the debug level up to 5 (man that's a lot), and here's what I think is the relevent chunk:

<geekispv2%geekisp-v2-test 192.168.4.38(31992) 9155>STATEMENT: UPDATE job_members SET "start_time" = NULL, "exit_code" = NULL, "create_time" = '2006-01-07 17:10:31', "job_id" = 30, "command" = '--- !ruby/object:GeekISP::ShellCommand
        args:
        command: /bin/ls /tmp/
        exit_code:
        logger: !ruby/object:Logger
          datetime_format:
          level: 0
          logdev: !ruby/object:Logger::LogDevice
            dev: !ruby/object:File {}
filename: /mnt/scratch/dave/control_panel/trunk/config/../log/test.log
            shift_age: 0
            shift_size: 1048576
          progname:
        run_as:
        stderr:
        stdin:
stdout: ', "finish_time" = NULL, "last_update_time" = NULL, "host_id" = 4, "status" = 'in_progress', "output" = NULL WHERE job_member_id = 105 <geekispv2%geekisp-v2-test 192.168.4.38(31992) 9155>DEBUG: name: unnamed; blockState: STARTED; state: INPROGR, xid/subid/cid: 10290432/1/0, nestlvl: 1, children: <> <geekispv2%geekisp-v2-test 192.168.4.38(31992) 9155>STATEMENT: UPDATE job_members SET "start_time" = NULL, "exit_code" = NULL, "create_time" = '2006-01-07 17:10:31', "job_id" = 30, "command" = '--- !ruby/object:GeekISP::ShellCommand
        args:
        command: /bin/ls /tmp/
        exit_code:
        logger: !ruby/object:Logger
          datetime_format:
          level: 0
          logdev: !ruby/object:Logger::LogDevice
            dev: !ruby/object:File {}
filename: /mnt/scratch/dave/control_panel/trunk/config/../log/test.log
            shift_age: 0
            shift_size: 1048576
          progname:
        run_as:
        stderr:
        stdin:
stdout: ', "finish_time" = NULL, "last_update_time" = NULL, "host_id" = 4, "status" = 'in_progress', "output" = NULL WHERE job_member_id = 105
<geekispv2%geekisp-v2-test 192.168.4.38(31992) 9155>DEBUG:  proc_exit(0)
<geekispv2%geekisp-v2-test 192.168.4.38(31992) 9155>DEBUG:  shmem_exit(0)
<geekispv2%geekisp-v2-test 192.168.4.38(31992) 9155>LOG: disconnection: session time: 0:00:00.84 user=geekispv2 database=geekisp-v2-test host=192.168.4.38 port=31992
<geekispv2%geekisp-v2-test 192.168.4.38(31992) 9155>DEBUG:  exit(0)
<%  7303>DEBUG:  reaping dead processes
<%  7303>DEBUG:  server process (PID 9155) exited with exit code 0

That looks to me like a clean and normal exit. This is pointing more and more towards the client in the ruby case, isn't it?

Regards,
--
Dave Steinberg
http://www.geekisp.com/
http://www.steinbergcomputing.com/


[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
[Index of Archives]     [Postgresql Jobs]     [Postgresql Admin]     [Postgresql Performance]     [Linux Clusters]     [PHP Home]     [PHP on Windows]     [Kernel Newbies]     [PHP Classes]     [PHP Books]     [PHP Databases]     [Postgresql & PHP]     [Yosemite]
  Powered by Linux