On 02/24/2015 12:19 PM, jesper@xxxxxxxx wrote:
Hi.
We have just moved our 9.2.9 instance onto new beefy iron. The machine has
48 Intel cores and 3TB of memory, running Linux 3.13.0-43 (Ubuntu 12.04).
The problem is a bit hard to describe, but I suspect it is related to the
large amount of memory and probably to kernel or PostgreSQL/kernel interaction.
Performance is good up until the point where memory is full, at which point
behaviour becomes sluggish. sar -B output is:
          pgpgin/s pgpgout/s    fault/s  majflt/s   pgfree/s pgscank/s pgscand/s pgsteal/s   %vmeff
17:30:01  29513.79 140026.51  634164.09      0.00  160093.42      0.00      0.00      0.00     0.00
17:35:01  31351.13 154801.18  638880.17      0.12  184323.22      0.00      0.00      0.00     0.00
17:40:01  38269.69 128701.23  652375.35      0.23  176369.40      0.00      0.00      0.00     0.00
17:45:01  34834.14 135371.82  627765.26      0.00  169779.58      0.00      0.00      0.00     0.00
17:50:01  34039.17 134630.64  627500.30      0.00  174259.04      0.00      0.00      0.00     0.00
17:55:01  30318.70 150791.13  612534.75      0.00  163425.51      0.00      0.00      0.00     0.00
18:05:01  28446.80 122103.38  549891.01      0.26  141756.52      0.00    612.77    556.60    90.83
18:10:01  12944.16  39222.64  332848.27      4.27   82317.48      0.00   3037.11   2725.14    89.73
18:15:01  12955.10  47841.51  453714.33      3.95  106397.71   1018.39   4811.64   5421.81    93.00
18:20:01  16393.43  64063.10  548341.21      2.48  149489.93   6447.13   2238.06   8537.87    98.30
18:25:01  15725.89  59096.20  502043.56      0.27  152932.96   5197.78   2783.81   7782.02    97.50
18:30:01  12735.95  50460.08  394507.90      0.09  143488.71   4645.20   2141.35   6621.27    97.56
18:35:01  11995.37  52743.57  414669.31      0.02  134363.87   5096.32   1708.52   6668.57    98.00
18:40:01  11448.30  43185.84  373441.27      0.35  109712.93   3247.41   1772.93   4902.79    97.66
18:45:01  10959.95  44993.48  402033.19      0.04  115914.63   3157.24   2393.26   5378.58    96.90
18:50:01  11270.25  50853.00  431117.15      0.30  105697.26   3951.30   1951.31   5778.41    97.90
18:55:01  10086.69  59206.44  362027.12      0.70  104760.65   6928.91   1684.04   8497.73    98.66
Sluggishness starts around 18:10 and continues; load drops and IO is very
low.
All good suggestions welcome.
sysctl changes:
$ grep '^vm' /etc/sysctl.conf
vm.swappiness = 0
vm.dirty_background_ratio = 3
vm.dirty_ratio = 15
vm.dirty_expire_centisecs = 500
vm.dirty_writeback_centisecs = 100
Thanks
Jesper
Hello Jesper,
At first glance, looking at your sysctl config and noting that you have
48 cores and 3TB of RAM, you might consider setting vm.dirty_bytes and
vm.dirty_background_bytes instead of vm.dirty_ratio and
vm.dirty_background_ratio.
Your current settings start background writeback at 92GB of dirty data, and
force IO to become synchronous at 420GB, which is a crazy amount of
data to push onto a controller or disks. =( Even if you set
dirty_background_ratio to 1% and dirty_ratio to 2%, the numbers would
still be 31GB and 62GB respectively, so something lower than 1% seems
most useful, which is why the dirty_bytes and
dirty_background_bytes controls are available.
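In case it is useful, you can check which of the two mechanisms is
currently in effect; the *_bytes values read 0 while the ratios are being
used, and vice versa:
$ sysctl vm.dirty_ratio vm.dirty_bytes vm.dirty_background_ratio vm.dirty_background_bytes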
Capturing the output of /proc/meminfo (in a while loop or with watch)
during the stalls you note would also be useful for checking your
hypothesis regarding the large RAM; something like the loop below would do.
If the stalls start when dirty/writeback data reaches about 420GB (with
your current settings), that tells you the stalls come from writing back
all that dirty RAM. Otherwise, you'll probably need to look elsewhere.
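For example (just an illustration; adjust the interval and log path as you
see fit), this samples the Dirty and Writeback counters every five seconds:
$ while true; do date '+%F %T'; grep -E '^(Dirty|Writeback):' /proc/meminfo; sleep 5; done >> /tmp/meminfo.log
Correlating that log with the sar -B timestamps should show whether the
18:10 slowdown lines up with a large Dirty/Writeback backlog.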
I would keep vm.dirty_bytes and vm.dirty_background_bytes lower than
the amount of cache on your RAID controller, maybe 50% and 25% of its
total size, respectively. That should be a reasonable starting point for
testing, and you can adjust the values up and down as needed to get the
performance you're looking for. A sketch of what that might look like is
below.
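As a purely hypothetical example, assuming a controller with 1GB of
battery-backed cache, that rule of thumb would translate to something like:
vm.dirty_bytes = 536870912            # 512MB, ~50% of an assumed 1GB controller cache
vm.dirty_background_bytes = 268435456 # 256MB, ~25% of an assumed 1GB controller cache
Once the byte values are set, the kernel ignores the corresponding *_ratio
settings, and the change can be applied live with sysctl -p (or sysctl -w),
no restart needed.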
Hope this is helpful. =)
Regards,
Lacey