On 2015-12-01 18:58:31 +0100, Peter J. Holzer wrote:
> I suspect such an interaction because I cannot reproduce the problem
> outside of a stored procedure. A standalone Perl script doing the same
> requests doesn't get a timeout.
>
> I guess Alvaro is right: I should strace the postgres worker process
> while it executes the stored procedure. The problem of course is that
> it happens often enough be annoying, but rarely enough that it's not
> easily reproducible.

I did manage to catch a timeout once with strace in the meantime,
although that one was much more straightforward and less mysterious than
the original case: postgres process sends message, about 10 seconds
later it receives a SIGALRM which interrupts an epoll, reply hasn't yet
arrived, error message to client and log file. No waits in functions
which shouldn't wait, and no messages which arrive much later than they
were (presumably) sent.

The strace doesn't show a reason for the SIGALRM, though. There is no
alarm(2) or setitimer(2) system call in the trace. (I attached strace to
a running postgres process just after I got the prompt from "psql" and
before I typed "select * from mb_search('export');". I used a different
(but very similar) stored procedure for those tests because it is much
easier to find a search which is slow enough to trigger a timeout at
least sometimes than a data request, which normally finishes in
milliseconds.)

So I guess my next task will be to find out where that SIGALRM comes
from and/or whether I can just restart the zmq_msg_recv if it happens.

        hp

-- 
   _  | Peter J. Holzer    | I want to forget all about both belts and
|_|_) |                    | suspenders; instead, I want to buy pants
| |   | hjp@xxxxxx         | that actually fit.
__/   | http://www.hjp.at/ |   -- http://noncombatant.org/
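
To illustrate the "restart the zmq_msg_recv" idea: zmq_msg_recv() is
documented to fail with errno set to EINTR when a signal is delivered
before a message is available, so a retry-on-EINTR loop is one possible
approach. What follows is only a minimal sketch in C under stated
assumptions, not the code of the stored procedure discussed here: a pipe
and read() stand in for the ZeroMQ socket and zmq_msg_recv(), a 50 ms
interval timer stands in for the unexplained SIGALRM, and all names
(on_alarm, alarms_seen, reply_pipe, read_retry_eintr, run_eintr_demo)
are hypothetical.

```c
#define _XOPEN_SOURCE 700
#include <errno.h>
#include <signal.h>
#include <string.h>
#include <sys/time.h>
#include <unistd.h>

/* Hypothetical demo state: number of SIGALRMs delivered so far, and a
   pipe that stands in for the socket on which the reply arrives. */
static volatile sig_atomic_t alarms_seen = 0;
static int reply_pipe[2];

static void on_alarm(int signo)
{
    int saved_errno = errno;
    (void)signo;
    alarms_seen++;
    /* On the second alarm, simulate the late reply.
       write() is async-signal-safe. */
    if (alarms_seen == 2) {
        ssize_t ignored = write(reply_pipe[1], "R", 1);
        (void)ignored;
    }
    errno = saved_errno;
}

/* Retry a blocking read whenever a signal interrupts it (EINTR). */
static ssize_t read_retry_eintr(int fd, void *buf, size_t len)
{
    ssize_t n;
    do {
        n = read(fd, buf, len);
    } while (n == -1 && errno == EINTR);
    return n;
}

/* Returns 1 if the simulated reply was received despite the
   interrupting signals, -1 on any error. */
int run_eintr_demo(void)
{
    struct sigaction sa;
    struct itimerval timer;
    char buf[1];
    ssize_t n;

    alarms_seen = 0;
    if (pipe(reply_pipe) == -1)
        return -1;

    /* Install the handler without SA_RESTART, so that an interrupted
       read() fails with EINTR instead of being restarted by the kernel. */
    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_alarm;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGALRM, &sa, NULL) == -1)
        return -1;

    /* Deliver SIGALRM every 50 ms. */
    memset(&timer, 0, sizeof timer);
    timer.it_value.tv_usec = 50000;
    timer.it_interval.tv_usec = 50000;
    if (setitimer(ITIMER_REAL, &timer, NULL) == -1)
        return -1;

    n = read_retry_eintr(reply_pipe[0], buf, sizeof buf);

    /* Disarm the timer again. */
    memset(&timer, 0, sizeof timer);
    setitimer(ITIMER_REAL, &timer, NULL);

    close(reply_pipe[0]);
    close(reply_pipe[1]);
    return (n == 1 && buf[0] == 'R') ? 1 : -1;
}
```

Note that a blind retry like this also defeats whatever timeout the
SIGALRM may have been meant to enforce, so a real loop would probably
want its own deadline check before calling the receive function again.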