On Sat, Nov 8, 2014 at 2:11 PM, Ruben Domingo Gaspar Aparicio wrote:
> The slave (I don't have control on the master) is using 2 NFS file systems,
> one for WALs and another one for the data, on Netapp controllers:
>
> dbnasg401-12a:/vol/dodpupdbtst02 on /ORA/dbs02/PUPDBTST type nfs
> (rw,remount,hard,nointr,rsize=32768,wsize=32768,tcp,actimeo=0,timeo=600)
>
> dbnasg403-12a:/vol/dodpupdbtst03 on /ORA/dbs03/PUPDBTST type nfs
> (rw,remount,hard,nointr,rsize=32768,wsize=32768,tcp,actimeo=0,timeo=600)

You should mount with noatime to avoid unnecessary IO.

> The master produces quite a lot of WALs. This is what I get on the slave
> (number of WAL files, date-hour, Total size in MB), so per day is more than
> 400GB:

> I tried to play with how the IO is handled, making it less strict setting
> synchronous_commit and fsync to off with not much success.
>
> I have also done a second test increasing shared_buffers from 12GB to 24GB
> (we are running on a 48GB, 8 cores server).

> Please let me know if you can see something obvious I am missing.

Your IO system needs to be able to deliver sustained IO bandwidth at
least as large as you need to read and write all the changes. What raw
IO bandwidth do those NFS file systems deliver _long term_? I am not
talking about spikes, because there are buffers. I am talking about the
minimum of the network throughput on the one hand and the raw disk IO
those boxes can do on the other. Then, how much of it is available to
your slave?

Did you do the math to ensure that the IO bandwidth you have available
on the slave is at least as high as what is needed? Note that it's not
simply the WAL size that needs to be written and read but also data
pages.

Kind regards

robert

-- 
[guy, jim].each {|him| remember.him do |as, often| as.you_can - without end}
http://blog.rubybestpractices.com/
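
PS: Here is a rough sketch in Python of the kind of math I mean. Only the 400GB per day comes from your mail. The network and disk throughput, the share available to the slave and the factor for data page IO are made-up placeholders, so please replace them with what you actually measure.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def sustained_mb_per_s(gb_per_day):
    """Average rate in MB/s needed to move gb_per_day gigabytes within one day."""
    return gb_per_day * 1024 / SECONDS_PER_DAY


def available_mb_per_s(network_mb_per_s, disk_mb_per_s, share):
    """Long-term bandwidth: the slower of network and disk, times the slave's share."""
    return min(network_mb_per_s, disk_mb_per_s) * share


# From the original mail: more than 400GB of WAL per day.
wal_rate = sustained_mb_per_s(400)

# ASSUMPTION: WAL is written once and read once during replay, and replay
# also causes data page IO. A factor of 3 is a placeholder, not a measurement.
io_factor = 3.0
needed = wal_rate * io_factor

# ASSUMPTION: hypothetical figures for the filer; measure the real ones.
available = available_mb_per_s(network_mb_per_s=110.0,
                               disk_mb_per_s=200.0,
                               share=0.5)

print(f"WAL alone:  {wal_rate:.2f} MB/s")   # 4.74 MB/s
print(f"needed:     {needed:.2f} MB/s")
print(f"available:  {available:.2f} MB/s")
print("enough headroom" if available >= needed else "IO bound")
```

A daily average hides the busy hours, so it is worth repeating the calculation with the largest hourly figure from your listing instead of the daily total.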