Darron Harrison <darron@xxxxxxxxxxxxxxxx> writes: > I am currently seeing three WAL sender processes on the master, and > no stale replication connections on the slaves. > > To clarify, are you saying there should be two WAL sender processes > for each slave for a total of six? > Silly question... Have you verified that autovac is actually running? launcher still alive, not malconfig'd as to do nothing... etc? Silly question #2; supposing autovac is trying to do as intended, have you inspected the logs to insure that it's not erroring out on one or more tables, catalogs, indexes... etc? I've reade most of this thread and not seeing any indication if the problem is confined to just one object or several. FWIW > Darron > > > ---- On Thu, 26 Mar 2020 15:26:01 -0700 Rui DeSousa > <rui@xxxxxxxxxxxxx> wrote ---- > > > > > > On Mar 26, 2020, at 5:40 PM, Darron Harrison < > darron@xxxxxxxxxxxxxxxx> wrote: > > There is no current lag on the replicas. Replication does > traverse a firewall, but we have made no changes recently. > > I will say that one of the hot standbys was only recently > attached, and it seems like the issues started when we began > sending some longer running queries it's way. We have since > placed those queries back on the master, but the vacuum > issues remain. > > One side effect of whatever is happening, is that nightly > backups are taking twice as long as normal. > > > It could sill be a bad/sale replication session. If the firewall > drops the replication stream and does not send a reset packets > (bad pratice) then the replication session might still be > lingering on the Postgres server and holding on to a very old > xmin. Do you know if you have TCP/IP Keepalive enabled? I don’t > think in 9.2 replication sessions are listed in pg_stat_activity; > thus, you’ll have to look for TCP/IP connections to the replics > that should not exist. > > Check all the upstream servers for stale TCP/IP replication > connections; using netstat. I would also look at the system’s > process list for walsender processes to see if there more more > than there should be; i.e. two of them for single replica. > > If you do find one; the best way to terminate it is to drop the > TCP/IP connection. i.e. In FreeBSD it would be the command > “tcpdrop”; for Linux there are few utilities that do same -- I > just don’t recall the name of them. > > > > > > > > -- Jerry Sievers Postgres DBA/Development Consulting e: postgres.consulting@xxxxxxxxxxx