> > The slave (I don't have control on the master) is using 2 NFS file > > systems, one for WALs and another one for the data, on Netapp controllers: > > > > dbnasg401-12a:/vol/dodpupdbtst02 on /ORA/dbs02/PUPDBTST type nfs > > > (rw,remount,hard,nointr,rsize=32768,wsize=32768,tcp,actimeo=0,timeo=60 > > 0) > > > > dbnasg403-12a:/vol/dodpupdbtst03 on /ORA/dbs03/PUPDBTST type nfs > > > (rw,remount,hard,nointr,rsize=32768,wsize=32768,tcp,actimeo=0,timeo=60 > > 0) > > You should use noatime to avoid unnecessary IO. > Just to mention that changing the mount points from: dbnasg403-12a:/vol/dodpupdbtst03 on /ORA/dbs03/PUPDBTST type nfs (rw, actimeo=0,hard,nointr,rsize=65536,wsize=65536,tcp,timeo=600) dbnasg401-12a:/vol/dodpupdbtst02 on /ORA/dbs02/PUPDBTST type nfs (rw, actimeo=0,hard,nointr,rsize=65536,wsize=65536,tcp,timeo=600) to dbnasg403-12a:/vol/dodpupdbtst03 on /ORA/dbs03/PUPDBTST type nfs (rw,noatime,hard,nointr,rsize=65536,wsize=65536,tcp,timeo=600) dbnasg401-12a:/vol/dodpupdbtst02 on /ORA/dbs02/PUPDBTST type nfs (rw,noatime,hard,nointr,rsize=65536,wsize=65536,tcp,timeo=600) it did have a big impact. The profile of the recovery process on terms of calls changed quite a lot: From: [postgres@itrac1202 tmp]$ strace -p 9596 -c Process 9596 attached - interrupt to quit Process 9596 detached % time seconds usecs/call calls errors syscall ------ ----------- ----------- --------- --------- ---------------- 78.73 0.217824 0 456855 381376 read 17.87 0.049453 0 515320 lseek 2.89 0.007989 12 669 669 poll 0.33 0.000912 25 36 open 0.07 0.000206 0 994 994 stat 0.05 0.000151 0 995 787 rt_sigreturn 0.05 0.000133 0 673 write 0.00 0.000000 0 36 close 0.00 0.000000 0 52 kill ------ ----------- ----------- --------- --------- ---------------- 100.00 0.276668 975630 383826 total To: [postgres@itrac1202 tmp]$ strace -p 9596 -c Process 9596 attached - interrupt to quit Process 9596 detached % time seconds usecs/call calls errors syscall ------ ----------- ----------- --------- --------- ---------------- 78.73 0.217824 0 456855 381376 read 17.87 0.049453 0 515320 lseek 2.89 0.007989 12 669 669 poll 0.33 0.000912 25 36 open 0.07 0.000206 0 994 994 stat 0.05 0.000151 0 995 787 rt_sigreturn 0.05 0.000133 0 673 write 0.00 0.000000 0 36 close 0.00 0.000000 0 52 kill ------ ----------- ----------- --------- --------- ---------------- 100.00 0.276668 975630 383826 total We did also increased the shared_buffers from 12 to 24GB. The lag has decreased most of the time: */10 * * * * /usr/local/pgsql/postgresql-9.2.9/bin/psql -U postgres -h /var/lib/pgsql/ -p 6600 -d puppetdb -c "SELECT now(), now() - pg_last_xact_replay_timestamp() AS time_lag" | perl -ne 'if (/\|\s+(\d{2}):(\d{2}):(\d{2})\.\d+/) {$hour=$1;$min=$2;$sec=$3; print $_;}' >> /tmp/lag929morememmount.log ... 2014-12-14 14:10:01.688947+01 | 00:00:00.096524 2014-12-14 14:20:01.798223+01 | 00:00:00.024083 2014-12-14 14:30:01.884448+01 | 00:00:00.420791 2014-12-14 14:40:01.960623+01 | 00:00:00.168318 2014-12-14 14:50:01.191487+01 | 00:00:00.163832 2014-12-14 15:00:02.146436+01 | 00:00:00.026934 2014-12-14 15:10:01.277963+01 | 00:00:00.332185 2014-12-14 15:20:01.353979+01 | 00:00:00.020616 2014-12-14 15:30:01.417092+01 | 00:00:00.584768 2014-12-14 15:40:01.575347+01 | 00:00:00.151685 2014-12-14 15:50:01.205507+01 | 00:00:00.102073 2014-12-14 16:00:01.321511+01 | 00:00:00.590677 2014-12-14 16:10:01.570474+01 | 00:00:00.182683 2014-12-14 16:20:01.640095+01 | 00:00:00.420185 2014-12-14 16:30:01.767033+01 | 00:00:00.015989 2014-12-14 16:40:01.849532+01 | 00:00:00.106296 2014-12-14 16:50:01.920876+01 | 00:00:00.258851 2014-12-14 17:00:02.000278+01 | 00:00:00.119841 2014-12-14 17:10:01.894227+01 | 00:00:00.091599 2014-12-14 17:20:01.61729+01 | 00:00:00.367367 2014-12-14 17:30:01.683326+01 | 00:00:00.103884 2014-12-14 17:40:01.755904+01 | 00:00:00.051262 2014-12-14 17:50:01.833825+01 | 00:00:00.06901 2014-12-14 18:00:01.901236+01 | 00:00:00.17467 2014-12-14 18:10:01.186283+01 | 00:00:00.214941 2014-12-14 18:20:01.145413+01 | 00:00:00.03517 2014-12-14 18:30:01.241746+01 | 00:00:00.207842 2014-12-14 18:40:01.299413+01 | 00:00:00.147878 2014-12-14 18:50:01.368541+01 | 00:00:00.393893 2014-12-14 19:00:01.430736+01 | 00:00:00.031226 2014-12-14 19:10:01.672117+01 | 00:05:03.512832 2014-12-14 19:20:01.9195+01 | 00:06:39.08761 2014-12-14 19:30:02.184486+01 | 00:00:00.307668 2014-12-14 19:40:01.227278+01 | 00:00:00.054831 2014-12-14 19:50:01.305485+01 | 00:00:00.425595 2014-12-14 20:00:01.410501+01 | 00:00:00.394526 2014-12-14 20:10:01.984196+01 | 00:00:00.388844 2014-12-14 20:20:01.031042+01 | 00:00:00.503092 2014-12-14 20:30:01.225871+01 | 00:00:00.241493 2014-12-14 20:40:01.305696+01 | 00:00:00.280656 2014-12-14 20:50:01.379617+01 | 00:00:00.151103 2014-12-14 21:00:01.468849+01 | 00:00:00.014412 2014-12-14 21:10:01.724514+01 | 00:00:00.147476 2014-12-14 21:20:01.799292+01 | 00:00:00.08696 2014-12-14 21:30:01.866336+01 | 00:00:00.035226 2014-12-14 21:40:01.942882+01 | 00:00:00.111701 2014-12-14 21:50:02.010419+01 | 00:00:00.215121 2014-12-14 22:00:01.110033+01 | 00:00:16.460612 2014-12-14 22:10:01.568286+01 | 00:00:00.077897 2014-12-14 22:20:01.682714+01 | 00:00:00.104112 2014-12-14 22:30:01.758958+01 | 00:00:00.061474 2014-12-14 22:40:01.970545+01 | 00:00:00.108613 2014-12-14 22:50:01.038908+01 | 00:00:00.039637 2014-12-14 23:00:01.120295+01 | 00:00:00.338731 2014-12-14 23:10:01.365371+01 | 00:00:00.680065 2014-12-14 23:20:01.423365+01 | 00:00:00.154614 2014-12-14 23:30:01.48998+01 | 00:00:00.014643 2014-12-14 23:40:01.569452+01 | 00:00:00.126961 2014-12-14 23:50:01.63047+01 | 00:00:00.303156 2014-12-15 00:00:01.278047+01 | 00:00:00.351391 2014-12-15 00:10:01.382566+01 | 00:00:00.012265 2014-12-15 00:20:01.444746+01 | 00:07:39.002651 2014-12-15 00:30:01.510413+01 | 00:16:13.476753 2014-12-15 00:40:01.97735+01 | 00:00:00.105011 2014-12-15 00:50:01.082528+01 | 00:01:10.313796 2014-12-15 01:00:01.124843+01 | 00:00:01.508016 2014-12-15 01:10:01.818415+01 | 00:00:00.082441 2014-12-15 01:20:01.961064+01 | 00:00:00.048221 2014-12-15 01:30:01.355472+01 | 00:00:00.37941 2014-12-15 01:40:01.42728+01 | 00:00:00.013836 2014-12-15 01:50:01.486446+01 | 00:00:00.110321 2014-12-15 02:00:01.566731+01 | 00:00:00.290281 2014-12-15 02:10:01.236574+01 | 00:01:15.954532 2014-12-15 02:20:01.440259+01 | 00:00:00.471677 2014-12-15 02:30:01.5733+01 | 00:00:00.208574 2014-12-15 02:40:01.662425+01 | 00:00:00.591091 2014-12-15 02:50:01.263385+01 | 00:00:00.050648 2014-12-15 03:00:01.340777+01 | 00:00:00.289115 2014-12-15 03:10:01.993079+01 | 00:00:00.790201 2014-12-15 03:20:01.061826+01 | 00:00:00.043176 2014-12-15 03:30:01.125639+01 | 00:00:00.172924 2014-12-15 03:40:01.252033+01 | 00:03:05.113579 2014-12-15 03:50:01.362396+01 | 00:00:00.254974 2014-12-15 04:00:01.370922+01 | 00:00:00.208254 2014-12-15 04:10:01.472816+01 | 00:00:00.077214 2014-12-15 04:20:01.553443+01 | 00:00:00.135887 2014-12-15 04:30:01.63607+01 | 00:00:00.027272 2014-12-15 04:40:01.696442+01 | 00:00:00.130954 2014-12-15 04:50:01.786961+01 | 00:00:00.572573 2014-12-15 05:00:01.790753+01 | 00:00:00.491799 2014-12-15 05:10:01.078332+01 | 00:07:58.438202 **** likely autovacuum 2014-12-15 05:20:01.139541+01 | 00:00:00.057486 2014-12-15 05:30:01.251079+01 | 00:00:00.053462 2014-12-15 05:40:01.322349+01 | 00:00:00.084701 2014-12-15 05:50:01.607937+01 | 00:00:00.205241 2014-12-15 06:00:01.699406+01 | 00:00:00.121415 2014-12-15 06:10:01.756047+01 | 00:00:00.20769 2014-12-15 06:20:01.854222+01 | 00:00:00.03397 2014-12-15 06:30:02.041054+01 | 00:03:07.271295 2014-12-15 06:40:01.891882+01 | 00:00:00.263748 2014-12-15 06:50:01.987809+01 | 00:00:00.155619 2014-12-15 07:00:01.068556+01 | 00:00:00.119866 2014-12-15 07:10:01.318478+01 | 00:00:00.092856 2014-12-15 07:20:01.704899+01 | 00:00:00.106533 2014-12-15 07:30:01.773268+01 | 00:00:00.135743 2014-12-15 07:40:01.730152+01 | 00:00:00.06358 2014-12-15 07:50:01.798179+01 | 00:00:00.529685 2014-12-15 08:00:01.868205+01 | 00:00:00.194482 2014-12-15 08:10:01.219339+01 | 00:00:00.063553 2014-12-15 08:20:01.309426+01 | 00:00:00.056698 2014-12-15 08:30:01.120431+01 | 00:00:00.425596 2014-12-15 08:40:01.201882+01 | 00:00:00.00909 2014-12-15 08:50:01.272526+01 | 00:00:00.019492 2014-12-15 09:00:01.361022+01 | 00:00:00.423997 2014-12-15 09:10:01.603702+01 | 00:00:00.066705 2014-12-15 09:20:01.682277+01 | 00:00:09.251202 2014-12-15 09:30:01.934477+01 | 00:00:00.311553 2014-12-15 09:40:02.03221+01 | 00:00:00.125678 2014-12-15 09:50:01.105372+01 | 00:00:00.294006 2014-12-15 10:00:01.201109+01 | 00:00:00.014641 2014-12-15 10:10:01.164478+01 | 00:01:51.375378 2014-12-15 10:20:01.264589+01 | 00:09:54.476361 **** likely autovacuum 2014-12-15 10:30:01.351103+01 | 00:00:00.213636 2014-12-15 10:40:01.623903+01 | 00:00:00.488103 2014-12-15 10:50:01.768132+01 | 00:00:00.080799 2014-12-15 11:00:01.880247+01 | 00:00:20.401738 2014-12-15 11:10:01.215509+01 | 00:00:00.036288 2014-12-15 11:20:01.265607+01 | 00:00:00.057142 2014-12-15 11:30:01.343731+01 | 00:00:00.036609 2014-12-15 11:40:01.41248+01 | 00:00:00.218139 2014-12-15 11:50:01.48113+01 | 00:00:00.242754 2014-12-15 12:00:01.685114+01 | 00:00:00.82528 2014-12-15 12:10:01.995243+01 | 00:02:29.971448 2014-12-15 12:20:01.962833+01 | 00:00:00.118112 2014-12-15 12:30:01.100587+01 | 00:00:00.214437 2014-12-15 12:40:01.226111+01 | 00:00:00.052599 2014-12-15 12:50:01.300061+01 | 00:00:00.162205 2014-12-15 13:00:01.4007+01 | 00:00:00.707891 2014-12-15 13:10:02.005526+01 | 00:00:00.162238 2014-12-15 13:20:01.072375+01 | 00:00:00.214978 2014-12-15 13:30:01.446005+01 | 00:00:00.121816 2014-12-15 13:40:01.483524+01 | 00:00:00.650178 2014-12-15 13:50:01.796143+01 | 00:00:00.065482 2014-12-15 14:00:01.886071+01 | 00:00:00.237577 2014-12-15 14:10:01.134148+01 | 00:00:00.193941 2014-12-15 14:20:01.199047+01 | 00:00:00.068058 2014-12-15 14:30:01.27777+01 | 00:00:00.022991 2014-12-15 14:40:01.361959+01 | 00:00:00.439753 2014-12-15 14:50:01.421515+01 | 00:00:00.037749 2014-12-15 15:00:01.500559+01 | 00:00:00.174448 2014-12-15 15:10:01.811804+01 | 00:06:09.196648 **** likely autovacuum .. It goes up till a maximum of 25 minutes (for the last two weeks), it looks correlated with an autovacuum at the master in one of the big tables of the schema. It happens at about 5hours interval. Is there a way to avoid this ? Should I ask to the master db dba to try to have a more active autovacuum policy? Thank you, Ruben -- Sent via pgsql-performance mailing list (pgsql-performance@xxxxxxxxxxxxxx) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance