Are you certain that there are no relevant errors in the database logs
(on both master & slave)? Also, are you sure that you didn't
misconfigure logging such that errors wouldn't appear?

On Thu, Aug 15, 2013 at 11:45 AM, Andrew Berman <rexxe98@xxxxxxxxx> wrote:
> Hi Lonni,
>
> Yes, I am using PG 9.1.9.
> Yes, 1 slave syncing from the master
> CentOS 6.4
> I don't see any network or hardware issues (e.g. NIC) but will look more
> into this. They are communicating on a private network and switch.
>
> I forgot to mention that after I restart the slave, everything syncs right
> back up and all is working again, so if it is a network issue, the
> replication is just stopping after some hiccup instead of retrying and
> resuming when things are back up.
>
> Thanks!
>
>
>
> On Thu, Aug 15, 2013 at 11:32 AM, Lonni J Friedman <netllama@xxxxxxxxx>
> wrote:
>>
>> I've never seen this happen. Looks like you might be using 9.1? Are
>> you up to date on all the 9.1.x releases?
>>
>> Do you have just 1 slave syncing from the master?
>> Which OS are you using?
>> Did you verify that there aren't any network problems between the
>> slave & master?
>> Or hardware problems (like the NIC dying, or dropping packets)?
>>
>>
>> On Thu, Aug 15, 2013 at 11:07 AM, Andrew Berman <rexxe98@xxxxxxxxx> wrote:
>> > Hello,
>> >
>> > I'm having an issue where streaming replication just randomly stops
>> > working. I haven't been able to find anything in the logs which
>> > points to an issue, but the Postgres process shows a "waiting"
>> > status on the slave:
>> >
>> > postgres 5639 0.1 24.3 3428264 2970236 ? Ss Aug14 1:54 postgres: startup process recovering 000000010000053D0000003F waiting
>> > postgres 5642 0.0 21.4 3428356 2613252 ? Ss Aug14 0:30 postgres: writer process
>> > postgres 5659 0.0 0.0 177524 788 ? Ss Aug14 0:03 postgres: stats collector process
>> > postgres 7159 1.2 0.1 3451360 18352 ? Ss Aug14 17:31 postgres: wal receiver process streaming 549/216B3730
>> >
>> > The replication works great for days, but randomly seems to lock up and
>> > replication halts. I verified that the two databases were out of sync
>> > with a query on both of them. Has anyone experienced this issue before?
>> >
>> > Here are some relevant config settings:
>> >
>> > Master:
>> >
>> > wal_level = hot_standby
>> > checkpoint_segments = 32
>> > checkpoint_completion_target = 0.9
>> > archive_mode = on
>> > archive_command = 'rsync -a %p foo@foo:/var/lib/pgsql/9.1/wals/%f </dev/null'
>> > max_wal_senders = 2
>> > wal_keep_segments = 32
>> >
>> > Slave:
>> >
>> > wal_level = hot_standby
>> > checkpoint_segments = 32
>> > #checkpoint_completion_target = 0.5
>> > hot_standby = on
>> > max_standby_archive_delay = -1
>> > max_standby_streaming_delay = -1
>> > #wal_receiver_status_interval = 10s
>> > #hot_standby_feedback = off
>> >
>> > Thank you for any help you can provide!
>> >
>> > Andrew
>> >

--
Sent via pgsql-general mailing list (pgsql-general@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general
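
As a rough aid for reading the ps output quoted above, the sketch below compares the WAL segment the startup process reports it is recovering with the location the wal receiver reports it is streaming. It is only an illustration, not part of the original thread: the helper names (WalPosition, position_from_segment_name, position_from_location) are made up for this example, and it assumes the default 16 MB WAL segment size. It deliberately compares positions by (log id, segment) rather than computing an exact byte difference.

```python
from typing import NamedTuple


class WalPosition(NamedTuple):
    """A coarse WAL position; tuples compare by log_id first, then segment."""
    log_id: int   # logical xlog file number (the part before the '/')
    segment: int  # segment number within that logical file


# Assumption: the default 16 MB WAL segment size is in use.
SEGMENT_SIZE = 16 * 1024 * 1024


def position_from_segment_name(name: str) -> WalPosition:
    """Parse a 24-hex-digit WAL file name (timeline, log id, segment)."""
    if len(name) != 24:
        raise ValueError(f"unexpected WAL file name: {name!r}")
    # The first 8 hex digits are the timeline, which is ignored here.
    return WalPosition(int(name[8:16], 16), int(name[16:24], 16))


def position_from_location(location: str) -> WalPosition:
    """Parse an 'X/Y' xlog location into the segment that contains it."""
    log_id, offset = location.split("/")
    return WalPosition(int(log_id, 16), int(offset, 16) // SEGMENT_SIZE)


# Values copied from the ps output in the original message.
replaying = position_from_segment_name("000000010000053D0000003F")
receiving = position_from_location("549/216B3730")

print("replay  :", replaying)
print("receive :", receiving)
print("replay behind receive:", replaying < receiving)
print("logical xlog files apart:", receiving.log_id - replaying.log_id)
```

With the values from the quoted output, the receive position is 12 logical xlog files ahead of the segment being recovered, which would suggest that WAL is still arriving while replay is the part that is waiting. On a 9.1 standby, the same two positions can be read directly with pg_last_xlog_receive_location() and pg_last_xlog_replay_location().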