On Fri, Jun 05, 2020 at 10:57:46PM +0200, Aleš Zelený wrote:
> we are using logical replication for more than 2 years and today I've found
> new not yet know error message from wal receiver. The replication was in
> catchup mode (on publisher side some new tables were created and added to
> publication, on subscriber side they were missing).

This comes from pqCheckInBufferSpace() in libpq when realloc() fails,
most probably because this host ran out of memory.

> Repeated several times, finally it proceeded and switch into streaming
> state. The OS has 64GB RAM, OS + database instance are using usually 20GB
> rest is used as OS buffers. I've checked monitoring (sampled every 10
> seconds) and no memory usage peak was visible, so unless it was a very
> short memory usage peak, I'd not expect the system running out of memory.
>
> Is there something I can do to diagnose and avoid this issue?

Does the memory usage increase slowly over time?  Perhaps it was not a
peak and the memory usage was not steady?  One thing that could always
be tried if you are able to get a rather reproducible case would be to
use valgrind and check if it is able to detect any leaks.  And I am
afraid that it is hard to act on this report without more information.
--
Michael
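
To tell a slow increase from a short peak, sampling more often than
every 10 seconds may help.  Below is a minimal sketch, assuming a Linux
subscriber host where /proc/meminfo is readable.  The helper names
(parse_meminfo, is_growing, sample_committed) are made up for this
illustration and are not part of PostgreSQL or libpq.  Committed_AS is
only one plausible field to watch; which counter matters on a given
host is an assumption to verify.

```python
import time


def parse_meminfo(text):
    """Parse the text of /proc/meminfo into a dict of integer values.

    Most fields are reported in kB; the unit suffix is dropped.
    """
    values = {}
    for line in text.splitlines():
        name, _, rest = line.partition(":")
        fields = rest.split()
        if fields and fields[0].isdigit():
            values[name.strip()] = int(fields[0])
    return values


def is_growing(samples, min_increase_kb):
    """Return True if the series never decreases and rises overall
    by at least min_increase_kb (a crude hint of steady growth)."""
    if len(samples) < 2:
        return False
    non_decreasing = all(b >= a for a, b in zip(samples, samples[1:]))
    return non_decreasing and samples[-1] - samples[0] >= min_increase_kb


def sample_committed(interval_s=1.0, count=60, path="/proc/meminfo"):
    """Collect `count` readings of Committed_AS, one every interval_s
    seconds.  The field choice is an assumption of this sketch."""
    samples = []
    for _ in range(count):
        with open(path) as f:
            info = parse_meminfo(f.read())
        samples.append(info.get("Committed_AS", 0))
        time.sleep(interval_s)
    return samples
```

Running sample_committed() while the subscription is catching up and
passing the result to is_growing() gives a rough yes/no on steady
growth.  It does not replace valgrind for locating an actual leak.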