Simon Riggs <simon.riggs@xxxxxxxxxxxxxxxx> writes:
> On Thu, 3 Feb 2022 at 06:25, Michael Harris <harmic@xxxxxxxxx> wrote:
>> Some of these functions trigger fetching of remote resources, for
>> which a timeout is set using `alarm`. The function unfortunately does
>> not re-establish any pre-existing interval timers after it is done,
>> which leads to postgresql missing its own expected alarm signal.
>>
>> The reason that this was not affecting us on previous postgres
>> versions was this commit:
>>
>> https://github.com/postgres/postgres/commit/09cf1d52267644cdbdb734294012cf1228745aaa#diff-b12a7ca3bf9c6a56745844c2670b0b28d2a4237741c395dda318c6cc3664ad4a
>>
>> After this commit, once an alarm is missed, that backend never sets
>> one again, so no timeouts of any kind will work. Therefore, the
>> deadlock detector was never being run. Prior to that, the next time
>> any timeout was set by the backend it would re-establish its timer.
>>
>> We will of course fix our own code to prevent this issue, but I am a
>> little concerned at the above commit as it reduces the robustness of
>> postgres in this situation. Perhaps I will raise it on the
>> pgsql-hackers list.
>
> Hmm, so you turned off Postgres' alarms so they stopped working, and
> you're saying that is a robustness issue of Postgres?

If Michael's analysis were accurate, I'd agree that there is a robustness
issue, but I don't think there is. See timeout.c:220:

        /*
         * Get the time remaining till the nearest pending timeout.  If it is
         * negative, assume that we somehow missed an interrupt, and force
         * signal_pending off.  This gives us a chance to recover if the
         * kernel drops a timeout request for some reason.
         */
        nearest_timeout = active_timeouts[0]->fin_time;
        if (now > nearest_timeout)
        {
            signal_pending = false;
            /* force an interrupt as soon as possible */
            secs = 0;
            usecs = 1;
        }

Now admittedly we don't have a good way to test this stanza, but it
should result in re-establishing the timer interrupt the next time any
timeout.c API is invoked after a missed interrupt.

I don't see anything more that we could or should do. We're not
going to issue setitimer() after every user-defined function call.

			regards, tom lane