Re: standby not cleaning up WAL files often enough

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Stupid question but it needs to be asked.

Do you have this in your recovery.conf file ?

archive_cleanup_command = 'pg_archivecleanup path to archived files %r'

On Wed, Apr 22, 2015 at 4:22 PM, Jeff Frost <jeff@xxxxxxxxxxxxx> wrote:
I’ve got an interesting situation on a new 9.4.1 streaming replica.

Occasionally it will get disconnected and need to catch up by pulling files from the wal-archive via wal-e.  This works fine and most of the time it will catch up and reconnect.

Ocasionally though, it will run the 50G pg_xlog partition out of disk space because it doesn’t seem to clean up the WAL files nearly fast enough.  For those who haven’t done the math in their head, that’s a bit over 3,000 WAL files.

checkpoint_segments = 33
checkpoint_completion_target = 0.9
checkpoint_timeout = 10min

Here’s how the logs look:

2015-04-22 20:48:34.331 GMT,,,32645,,553298b3.7f85,2666,,2015-04-18 17:47:31 GMT,,0,LOG,00000,"restartpoint complete: wrote 382002 buffers (36.4%); 0 transaction log file(s) added, 461 removed, 67 recycled; write=197.586 s, sync=0.253 s, total=199.001 s; sync files=397, longest=0.146 s, average=0.000 s",,,,,,,,,""
2015-04-22 20:52:01.176 GMT,,,32645,,553298b3.7f85,2669,,2015-04-18 17:47:31 GMT,,0,LOG,00000,"restartpoint complete: wrote 395868 buffers (37.8%); 0 transaction log file(s) added, 428 removed, 67 recycled; write=205.487 s, sync=0.281 s, total=206.843 s; sync files=510, longest=0.241 s, average=0.000 s",,,,,,,,,""
2015-04-22 20:55:13.082 GMT,,,32645,,553298b3.7f85,2672,,2015-04-18 17:47:31 GMT,,0,LOG,00000,"restartpoint complete: wrote 369073 buffers (35.2%); 0 transaction log file(s) added, 461 removed, 67 recycled; write=190.356 s, sync=0.469 s, total=191.903 s; sync files=481, longest=0.363 s, average=0.000 s",,,,,,,,,""
2015-04-22 20:58:30.610 GMT,,,32645,,553298b3.7f85,2675,,2015-04-18 17:47:31 GMT,,0,LOG,00000,"restartpoint complete: wrote 380614 buffers (36.3%); 0 transaction log file(s) added, 461 removed, 67 recycled; write=196.295 s, sync=0.150 s, total=197.526 s; sync files=478, longest=0.062 s, average=0.000 s",,,,,,,,,""
2015-04-22 21:02:01.334 GMT,,,32645,,553298b3.7f85,2678,,2015-04-18 17:47:31 GMT,,0,LOG,00000,"restartpoint complete: wrote 396343 buffers (37.8%); 0 transaction log file(s) added, 461 removed, 67 recycled; write=209.380 s, sync=0.192 s, total=210.722 s; sync files=444, longest=0.161 s, average=0.000 s",,,,,,,,,""
2015-04-22 21:05:33.490 GMT,,,32645,,553298b3.7f85,2681,,2015-04-18 17:47:31 GMT,,0,LOG,00000,"restartpoint complete: wrote 404200 buffers (38.5%); 0 transaction log file(s) added, 428 removed, 67 recycled; write=210.967 s, sync=0.182 s, total=212.154 s; sync files=505, longest=0.142 s, average=0.000 s",,,,,,,,,""
2015-04-22 21:08:37.649 GMT,,,32645,,553298b3.7f85,2684,,2015-04-18 17:47:31 GMT,,0,LOG,00000,"restartpoint complete: wrote 417375 buffers (39.8%); 0 transaction log file(s) added, 461 removed, 67 recycled; write=182.664 s, sync=0.391 s, total=184.157 s; sync files=448, longest=0.198 s, average=0.000 s",,,,,,,,,""
2015-04-22 21:08:47.816 GMT,,,32645,,553298b3.7f85,2687,,2015-04-18 17:47:31 GMT,,0,LOG,00000,"restartpoint complete: wrote 914929 buffers (87.3%); 0 transaction log file(s) added, 485 removed, 10 recycled; write=5.511 s, sync=3.434 s, total=10.165 s; sync files=459, longest=0.484 s, average=0.007 s",,,,,,,,,""
2015-04-22 21:10:07.464 GMT,,,32645,,553298b3.7f85,2690,,2015-04-18 17:47:31 GMT,,0,LOG,00000,"restartpoint complete: wrote 8421 buffers (0.8%); 0 transaction log file(s) added, 395 removed, 67 recycled; write=70.043 s, sync=0.278 s, total=71.243 s; sync files=182, longest=0.234 s, average=0.001 s",,,,,,,,,”"

All of those are started by "xlog" and not by "time'.

At first I was thinking perhaps wal-e was prefetching way too many WAL files, but it appears to only pre-fetch 8 by default:


Also, I verified that there were almost 3k WAL files in the pg_xlog directory itself.

I guess my question is: how can I avoid this behavior other than just continuing to grow that pg_xlog partition?





--
Thanks,

Jorge Torralba
----------------------------

Note: This communication may contain privileged or other confidential information. If you are not the intended recipient, please do not print, copy, retransmit, disseminate or otherwise use the information. Please indicate to the sender that you have received this email in error and delete the copy you received. Thank You.

[Index of Archives]     [KVM ARM]     [KVM ia64]     [KVM ppc]     [Virtualization Tools]     [Spice Development]     [Libvirt]     [Libvirt Users]     [Linux USB Devel]     [Linux Audio Users]     [Yosemite Questions]     [Linux Kernel]     [Linux SCSI]     [XFree86]

  Powered by Linux