Hi Tom,
Thanks for your input. Please find my comments inline below.
> We are using PostgreSQL 11 wherein intermittently the below exception is
> popping up, causing our application to lose connection with the database.
> It isn't reconnecting until the application is restarted.
> org.postgresql.util.PSQLException: An I/O error occurred while sending
> to the backend.
That certainly looks like loss of network connection. Had the connection
been sitting idle for a while before this query attempt?
- We are sending requests continuously using JMeter, and the exceptions are interspersed among successful requests: out of 100 requests, roughly 8-9 fail with this exception, and there is no gap between requests. I think the connections are kept open after a test run finishes, but in that case shouldn't the error appear on the first request when we start the next run? Instead, the exceptions appear after 10-15 requests.
> We have checked the PostgreSQL logs in detail, however we are unable to
> find any significant errors related to this issue.
I'd expect that the backend would eventually notice the dead connection.
But the timeout before it does so might be completely different from the
time at which the client notices the dead connection, so the relationship
might not be very obvious.
- Initially I was seeing connection termination errors in the PostgreSQL logs. Currently, however, the exception is not breaking connectivity, so no errors are being logged on the database side.
> All the servers are present in the same region and building.
Doesn't mean there's not routers or firewalls between them. I'd start
by looking for network timeouts, and possibly configuring the server
to send TCP keepalives more aggressively. (In this case it might be
HAProxy that needs to be sending keepalives ... don't know what options
it has for that.)
- I have made the following kernel (sysctl) changes on our HAProxy server:
net.ipv4.tcp_keepalive_time = 600
net.ipv4.tcp_keepalive_intvl = 60
net.ipv4.tcp_keepalive_probes = 20
Currently we are testing to see whether this did the trick.
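
For reference, here is a rough sketch of what the values above imply, assuming the standard Linux keepalive behaviour: the first probe is sent after tcp_keepalive_time seconds of idleness, then one probe every tcp_keepalive_intvl seconds, and the connection is dropped after tcp_keepalive_probes unanswered probes. The function names and the 300-second idle timeout are made up for illustration; they are not taken from our setup.

```python
def keepalive_timings(idle_s, interval_s, probes):
    """Return (seconds of idleness before the first probe,
    seconds of idleness before a dead peer is declared)."""
    first_probe = idle_s
    dead_after = idle_s + interval_s * probes
    return first_probe, dead_after


def probes_beat_idle_timeout(idle_s, middlebox_timeout_s):
    """True if the first keepalive probe goes out before an intermediate
    device (firewall, proxy) would drop the idle connection."""
    return idle_s < middlebox_timeout_s


# Values from the sysctl settings above.
first_probe, dead_after = keepalive_timings(600, 60, 20)
print(f"first probe after {first_probe} s, dead peer detected after {dead_after} s")

# Hypothetical idle timeout of 300 s on some device in the path (assumption).
print("probes arrive in time:", probes_beat_idle_timeout(600, 300))
```

With these settings a dead peer is only detected after 600 + 60 * 20 = 1800 seconds (30 minutes), and if anything in the path drops idle connections in under 10 minutes, the first probe would come too late. Note also that these sysctls only affect sockets that have SO_KEEPALIVE enabled; as far as I know, HAProxy enables that through "option tcpka" (or "option clitcpka" / "option srvtcpka"), which I still need to verify in our configuration.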
regards, tom lane