Postgres General List,
I am stumped trying to prevent an overflowing UDP buffer on a standby Postgres
service. Any help would be most appreciated.
~~~~~~~~
Essentially a UDP buffer associated with the pg_standby process on my localhost
interface gradually fills up once I start Postgres until it hits its maximum
capacity and then proceeds to steadily drop packets. A restart of Postgres (of
course) clears the buffer, but then it begins filling up again.
As far as I can tell, this is not actually causing any problems. (It is only
happening to the standby service, and failover data recovery shows nothing
missing.) Nevertheless, I don't want any buffers to overflow.
I have also posted this question to ServerFault (http://serverfault.com/questions/564905/udp-overflow-udp-drops-on-standby-postgres-service). That posting has more detail than I have provided below, such as how I identified pg_standby by querying the /proc files.
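
For anyone who wants to reproduce the /proc check, here is a minimal Python sketch of that kind of query. It is not the exact procedure from the ServerFault post. It assumes a little-endian host (such as x86_64) and a kernel whose /proc/net/udp includes the trailing "drops" column; the function names parse_udp_table and pids_for_inode are made up for illustration.

```python
import os
import socket
import struct


def parse_udp_table(text):
    """Parse the contents of /proc/net/udp into a list of dicts."""
    rows = []
    for line in text.splitlines()[1:]:  # skip the header line
        fields = line.split()
        if len(fields) < 13:
            continue  # assumption: kernel exposes the trailing "drops" column
        ip_hex, port_hex = fields[1].split(":")
        _tx_hex, rx_hex = fields[4].split(":")  # column is tx_queue:rx_queue
        rows.append({
            # assumption: little-endian host, so the hex IP is byte-swapped
            "local_ip": socket.inet_ntoa(struct.pack("<I", int(ip_hex, 16))),
            "local_port": int(port_hex, 16),
            "rx_queue": int(rx_hex, 16),   # bytes waiting in the receive buffer
            "inode": int(fields[9]),       # socket inode, used to find the owner
            "drops": int(fields[12]),      # datagrams dropped on this socket
        })
    return rows


def pids_for_inode(inode, proc_root="/proc"):
    """Return PIDs that hold an open fd on the socket with this inode."""
    target = "socket:[%d]" % inode
    pids = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue
        fd_dir = os.path.join(proc_root, entry, "fd")
        try:
            for fd in os.listdir(fd_dir):
                if os.readlink(os.path.join(fd_dir, fd)) == target:
                    pids.append(int(entry))
                    break
        except OSError:
            continue  # process exited, or not permitted to read its fds
    return pids


if __name__ == "__main__":
    with open("/proc/net/udp") as f:
        table = parse_udp_table(f.read())
    for row in table:
        if row["rx_queue"] > 0 or row["drops"] > 0:
            print(row, "pids:", pids_for_inode(row["inode"]))
```

Run as root so that the fd directories of other users' processes are readable; the PIDs it prints can then be matched against `ps` output.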
~~~~~~~~
==Salient points==:
a) by querying "/proc" information for UDP I can see non-empty buffers, and identify the "pg_standby" process as the culprit
b) the overflow occurs even when my firewalls on both servers (iptables) are shut down
c) my UDP buffers at 16MB+ seem more than big enough. I could make them larger but that would only mask the problem
d) online discussions of similar problems seem to finger either older versions of Postgres or the Statistics Collector; to rule out the latter I have tried to turn off all statistics collection (track_activities = off, track_counts = off), but the problem continues
e) a verbose wire sniff of the UDP packets shows nothing useful
f) there is not a great deal of database activity (e.g. roughly one 16MB WAL file is replicated from the primary to the secondary service every 45 minutes)
g) I formerly ran Postgres 8.3.5, with an otherwise identical setup; this problem only began when I upgraded to 9.1.9
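
One way to check whether the drops are receive-buffer overflows (as opposed to some other UDP error) is to watch the system-wide counters in /proc/net/snmp. The following is a rough Python sketch under that assumption; the helper name parse_udp_counters and the 60-second interval are illustrative only, and the RcvbufErrors field is assumed to be present in this kernel's output.

```python
import time


def parse_udp_counters(snmp_text):
    """Extract the 'Udp:' name/value line pair from /proc/net/snmp text."""
    # "UdpLite:" lines do not start with "Udp:", so they are excluded here.
    lines = [l.split() for l in snmp_text.splitlines() if l.startswith("Udp:")]
    if len(lines) < 2:
        return {}
    names, values = lines[0][1:], lines[1][1:]
    return dict(zip(names, (int(v) for v in values)))


def read_udp_counters(path="/proc/net/snmp"):
    with open(path) as f:
        return parse_udp_counters(f.read())


if __name__ == "__main__":
    before = read_udp_counters()
    time.sleep(60)  # arbitrary sampling interval
    after = read_udp_counters()
    for name in ("InDatagrams", "InErrors", "RcvbufErrors"):
        if name in before and name in after:
            print(name, "+%d" % (after[name] - before[name]))
```

If RcvbufErrors rises in step with InErrors, the datagrams are being discarded because a socket's receive buffer is full, which would match a process that never reads from its socket.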
~~~~~~~~
==Background on my setup==:
-- two CentOS 6.4 x86_64 systems (VMs), each running Postgres 9.1.9, in separate datacenters less than 50 miles apart
-- Postgres is active on my primary server and running in standby mode on my backup:
---- the backup Postgres service is receiving its data two ways:
------ as a warm standby processing WAL files via log shipping
------ on failover the current WAL file on the primary (not yet shipped) is recovered from a DRBD partition synced from the primary box
-- nothing else (of consequence) runs on these boxes except Postgres
~~~~~~~~
Thanks,
Daniel