On Tuesday 01 May 2007 2:23 pm, Tom Lane wrote:
> Well, it's going wrong here:
>
> socket(AF_INET, SOCK_STREAM, 0) .......................... = 4
> setsockopt(4, 0x6, TCP_NODELAY, 0x9fffffffffffe210, 4) ... = 0
> fcntl(4, F_SETFL, 65536) ................................. = 0
> fcntl(4, F_SETFD, 1) ..................................... = 0
> connect(4, 0x6000000000416ea0, 16) ....................... = 0
> getsockopt(4, SOL_SOCKET, SO_ERROR, 0x9fffffffffffe32c, 0x9fffffffffffe338) = 0
> close(4) ................................................. = 0
>
> The close() indicates we're into the failure path, so
> evidently the getsockopt returned a failure indication (though
> it's hard to tell what --- strerror() isn't providing anything
> useful). What strikes me as odd about this is that the
> connect() really should have returned EINPROGRESS or some
> other failure code, because we're doing it in nonblock mode. A
> zero return implies that the connection is already made, which
> it shouldn't be if you're connecting to some other machine (if
> this is a local connection then maybe it's sane, but I don't
> see that here when testing loopback TCP connections). So I
> wonder if connect() is blowing it here and claiming the
> connection is ready when it's not quite yet. Another
> possibility is that getsockopt() is returning bad data, which
> smells a bit more like the sort of thing that might go wrong
> in 64 vs 32 bit mode.

It is indeed a local connection using PGHOST=`hostname`. That name maps
to one of the external NIC IPs, not to the normal 127.0.0.1 loopback
address.

For context, I've seen this a number of times over the past couple
years, from pgsql 7.3.x to 8.1.x, HPUX 11.00 to 11.23, 32-bit-only and
32/64 Itaniums, always via a local connection using `hostname` mapping
to an external NIC. What it is about the reboots that triggers this
remains a mystery.

Ed
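
For anyone following along, the call sequence in the trace matches the usual non-blocking connect pattern: set O_NONBLOCK, call connect(), wait for writability if it reports EINPROGRESS, then read SO_ERROR to learn whether the connection actually succeeded. Below is a minimal sketch of that pattern. It is not libpq's actual code. The names pending_socket_error() and connect_nonblocking() are hypothetical, and it assumes IPv4 and a platform that provides inet_pton(). Note that it separates two cases: getsockopt() itself failing, and getsockopt() succeeding but reporting a nonzero pending error.

```c
#include <errno.h>
#include <fcntl.h>
#include <string.h>
#include <unistd.h>
#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/select.h>
#include <sys/socket.h>

/* Fetch the pending error on a socket (hypothetical helper).
 * Returns -1 if getsockopt() itself fails (errno is set); otherwise
 * returns 0 and stores the SO_ERROR value in *err_out (0 = no error). */
int pending_socket_error(int fd, int *err_out)
{
    int optval = 0;
    socklen_t optlen = sizeof(optval);

    if (getsockopt(fd, SOL_SOCKET, SO_ERROR, &optval, &optlen) < 0)
        return -1;
    *err_out = optval;
    return 0;
}

/* Non-blocking connect to an IPv4 address (hypothetical helper).
 * Returns a connected descriptor, or -1 with errno set. */
int connect_nonblocking(const char *ip, unsigned short port)
{
    struct sockaddr_in addr;
    int fd, flags, saved, err = 0;

    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_port = htons(port);
    if (inet_pton(AF_INET, ip, &addr.sin_addr) != 1) {
        errno = EINVAL;
        return -1;
    }

    fd = socket(AF_INET, SOCK_STREAM, 0);
    if (fd < 0)
        return -1;

    flags = fcntl(fd, F_GETFL, 0);
    if (flags < 0 || fcntl(fd, F_SETFL, flags | O_NONBLOCK) < 0)
        goto fail;

    if (connect(fd, (struct sockaddr *) &addr, sizeof(addr)) < 0) {
        fd_set wfds;

        if (errno != EINPROGRESS)
            goto fail;
        /* Still in progress: wait until the socket becomes writable. */
        FD_ZERO(&wfds);
        FD_SET(fd, &wfds);
        if (select(fd + 1, NULL, &wfds, NULL, NULL) < 0)
            goto fail;
    }
    /* connect() may also return 0 immediately, as it does in the trace. */

    /* Either way, SO_ERROR holds the final outcome of the attempt. */
    if (pending_socket_error(fd, &err) < 0)
        goto fail;
    if (err != 0) {
        errno = err;
        goto fail;
    }
    return fd;

fail:
    saved = errno;
    close(fd);
    errno = saved;
    return -1;
}
```

A caller would do something like fd = connect_nonblocking("192.0.2.10", 5432), where the address and port are placeholders, and report strerror(errno) on a -1 return. Logging the value stored by pending_socket_error() at the point of failure would show whether the kernel is handing back a real error code or garbage.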