> -----Original Message-----
> From: ldh@xxxxxxxxxxxxxxxxxx <ldh@xxxxxxxxxxxxxxxxxx>
> Sent: Saturday, December 4, 2021 14:18
> To: Justin Pryzby <pryzby@xxxxxxxxxxxxx>
> Cc: pgsql-performance@xxxxxxxxxxxxxx
> Subject: RE: An I/O error occurred while sending to the backend (PG 13.4)
>
>
> > -----Original Message-----
> > From: Justin Pryzby <pryzby@xxxxxxxxxxxxx>
> > Sent: Saturday, December 4, 2021 12:59
> > To: ldh@xxxxxxxxxxxxxxxxxx
> > Cc: pgsql-performance@xxxxxxxxxxxxxx
> > Subject: Re: An I/O error occurred while sending to the backend (PG 13.4)
> >
> > On Sat, Dec 04, 2021 at 05:32:10PM +0000, ldh@xxxxxxxxxxxxxxxxxx wrote:
> > > I have a data warehouse with a fairly complex ETL process that has
> > > been running for years now across PG 9.6, 11.2 and now 13.4 for the
> > > past couple of months. I have been getting the error "An I/O error
> > > occurred while sending to the backend" quite often under load in 13.4
> > > which I never used to get on 11.2. I have applied some tricks,
> > > particularly with the socketTimeout JDBC configuration.
> > >
> > > So my first question is whether anyone has any idea why this is
> > > happening? My hardware and general PG configuration have not
> > > changed between 11.2 and 13.4 and I NEVER experienced this on 11.2
> > > in about 2y of production.
> > >
> > > Second, I have one stored procedure that takes a very long time to
> > > run (40mn more or less), so obviously, I'd need to set socketTimeout to
> > > something like 1h in order to call it and not timeout. That doesn't seem
> > > reasonable?
> >
> > Is the DB server local or remote (TCP/IP) to the client?
> >
> > Could you collect the corresponding postgres query logs when this
> > happens ?
> >
> > It'd be nice to see a network trace for this too. Using tcpdump or
> > wireshark.
> > Preferably from the client side.
> >
> > FWIW, I suspect the JDBC socketTimeout is a bad workaround.
> >
> > --
> > Justin
>
> It's a remote server, but all on a local network. I am sure network
> performance is not the issue. Also, the system is on Windows Server. What are
> you expecting to see out of a tcpdump? I'll try to get PG logs on the failing query.
>
> Thank you,
> Laurent.
>
Hello Justin,
It has been ages! The issue has been happening a bit more often recently, as often as once every 10 days or so. As a reminder, the setup is Postgres 13.4 on Windows Server with 16 cores and 64 GB of memory.
I can't understand why you are still using 13.4.
[1] There is a long discussion about the issue with 13.4; the project was made to fix a DLL bottleneck.
Why not use 13.6?
regards,
Ranier Vilela