On 21/05/2015 14:57, Glyn Astill wrote:
From: Thomas SIMON <tsimon@xxxxxxxxxxx>
To: Glyn Astill <glynastill@xxxxxxxxxxx>
Cc: "pgsql-admin@xxxxxxxxxxxxxx" <pgsql-admin@xxxxxxxxxxxxxx>
Sent: Thursday, 21 May 2015, 13:12
Subject: Re: Performances issues with SSD volume ?
On 20/05/2015 18:50, Glyn Astill wrote:
From: Thomas SIMON <tsimon@xxxxxxxxxxx>
To: glynastill@xxxxxxxxxxx
Cc: "pgsql-admin@xxxxxxxxxxxxxx"
<pgsql-admin@xxxxxxxxxxxxxx>
Sent: Wednesday, 20 May 2015, 16:41
Subject: Re: Performances issues with SSD volume ?
Hi Glyn,
I'll try to answer these points.
I've run some benchmarks, and indeed kernel 3.2 is not helping, not
helping at all. I changed to 3.14 and the gap is quite big!
With the pgbench RW test: 3.2 --> 4200 TPS ; 3.14 --> 6900 TPS, under
the same conditions.
With the pgbench RO test: 3.2 --> 37000 TPS ; 3.14 --> 95000 TPS, same
conditions too.
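(For reference, a minimal sketch of the kind of runs above; the scale
factor, client count and duration here are placeholders, not the exact
values used:)

createdb bench
pgbench -i -s 300 bench              # initialise test tables at scale factor 300
pgbench -c 64 -j 8 -T 300 bench      # read/write run (default TPC-B-like script)
pgbench -c 64 -j 8 -T 300 -S bench   # read-only run (SELECT-only script)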
That's a start then.
So it should be better, but when the server was in production, even
with the bad kernel, performance was already quite good before it
quickly decreased.
So I also think I have another configuration problem.
You say you're IO bound, so some output from sar / iostat / dstat and
pg_stat_activity etc before and during the issue would be of use.
-> My server is not in production right now, so it is difficult to
replay the production load and get some useful metrics.
The best way I've found is to replay traffic from the logs with
pgreplay.
I hoped the server would break down again while replaying this traffic,
but it never happens ... another thing I can't understand ...
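(For reference, the replay itself is roughly the following; the host
and log path are placeholders. pgreplay needs a log produced with
log_statement = 'all', log_connections/log_disconnections = on and a
log_line_prefix containing %m, %u, %d and %c, which matches the
configuration further down:)

pgreplay -h <ssd-server> -p 5432 /var/log/postgresql/postgresql.log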
Below is my dstat output when I replay this traffic (and so when the
server runs normally).
Unfortunately I no longer have any output from when the server's
performance decreased.
It's a shame we can't get any insight into activity on the server
during the issues.
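(A minimal way to capture that the next time the degradation shows up,
assuming sysstat is installed and psql can connect locally as the
postgres user, could be something like:)

iostat -x 5  >> /var/tmp/iostat.log &
sar -u -d 5  >> /var/tmp/sar.log &
while true; do
    # snapshot of non-idle backends every 5 seconds (9.3 still has the
    # boolean "waiting" column)
    psql -XAtc "SELECT now(), pid, state, waiting, left(query, 80)
                FROM pg_stat_activity WHERE state <> 'idle'" \
        >> /var/tmp/activity.log
    sleep 5
done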
Other things you asked
System memory size : 256 GB
SSD model numbers and how many : 4 SSD disks ; RAID 10 ; model
INTEL SSDSC2BB480G4
Raid controller : MegaRAID SAS 2208
Partition alignments and stripe sizes : see fdisk below
Kernel options : the config file is here :
ftp://ftp.ovh.net/made-in-ovh/bzImage/3.14.43/config-3.14.43-xxxx-std-ipv6-64
Filesystem used and mount options : ext4, see mtab below
IO Scheduler : noop [deadline] cfq for my ssd raid volume
Postgresql version and configuration : 9.3.5
max_connections=1800
shared_buffers=8GB
temp_buffers=32MB
work_mem=100MB
maintenance_work_mem=12GB
bgwriter_lru_maxpages=200
effective_io_concurrency=4
wal_level=hot_standby
wal_sync_method=fdatasync
wal_writer_delay=2000ms
commit_delay=1000
checkpoint_segments=80
checkpoint_timeout=15min
checkpoint_completion_target=0.7
archive_command='rsync ....'
max_wal_senders=10
wal_keep_segments=38600
vacuum_defer_cleanup_age=100
hot_standby = on
max_standby_archive_delay = 5min
max_standby_streaming_delay = 5min
hot_standby_feedback = on
random_page_cost = 1.0
effective_cache_size = 240GB
log_min_error_statement = warning
log_min_duration_statement = 0
log_checkpoints = on
log_connections = on
log_disconnections = on
log_line_prefix = '%m|%u|%d|%c|'
log_lock_waits = on
log_statement = 'all'
log_timezone = 'localtime'
track_activities = on
track_functions = pl
track_activity_query_size = 8192
autovacuum_max_workers = 5
autovacuum_naptime = 30s
autovacuum_vacuum_threshold = 40
autovacuum_analyze_threshold = 20
autovacuum_vacuum_scale_factor = 0.10
autovacuum_analyze_scale_factor = 0.10
autovacuum_vacuum_cost_delay = 5ms
default_transaction_isolation = 'read committed'
max_locks_per_transaction = 128
Connection pool sizing (pgpool2)
num_init_children = 1790
max_pool = 1
1800 is quite a lot of connections, and with max_pool=1 in pgpool
you're effectively just using pgpool as a proxy (as I recall, my memory is a
little fuzzy on pgpool now). Unless your app is stateful in some way or has
unique users for each of those 1800 connections you should lower the quantity of
active connections. A general starting point is usually cpu cores * 2, so you
could up max_pool and divide num_init_children by the same amount.
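(Purely to illustrate that arithmetic, keeping num_init_children *
max_pool at roughly the same total; the factor of 20 below is an
assumption, not a recommendation for this particular workload:)

num_init_children = 90    # was 1790, divided by ~20
max_pool = 20             # was 1, multiplied by the same factor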
Hard to say what you need to do without knowing what exactly you're
doing though. What's the nature of the app(s)?
Yes, we just use it as a proxy for now.
We have approximately 100 different active users, each making a varying
number of connections (Twisted + Zope apps).
The result is ~900 idle connections for ~60 active connections, but
sometimes (when stopping/starting production) we need almost double the
connections because some Twisted services don't close their connections
immediately.
But this is the current (working) configuration, and I don't think my
disk performance problem is related to this.
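(A quick way to check that idle/active split directly on the server,
if useful:)

psql -XAtc "SELECT state, count(*) FROM pg_stat_activity
            GROUP BY state ORDER BY count(*) DESC"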
I think at this point you could do with going back and trying to reproduce the issue, then trace back up to pg_stat_activity to see what activity could be causing the disk i/o. I assume you've tried to reproduce the disk issues with a simple disk benchmark like bonnie++?
Yes, I think the same thing. I will probably do this early tomorrow
morning.
I've tried to reproduce the disk issues with different stress tests
such as bonnie++, fio and tsung, and I used a more realistic scenario
with pgreplay to reproduce my production traffic from the PostgreSQL
logfile.
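(As an example, an fio job roughly approximating an 8 kB random
read/write Postgres-style workload; every parameter and the directory
below are placeholders rather than the exact jobs that were run:)

mkdir -p /var/lib/postgresql/fio-test
fio --name=pgmix --directory=/var/lib/postgresql/fio-test \
    --rw=randrw --rwmixread=70 --bs=8k --direct=1 \
    --ioengine=libaio --iodepth=16 --numjobs=4 \
    --size=4G --runtime=300 --time_based --group_reporting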
However, I'm not sure how to diagnose performance issues.
I mean, if I see the SSDs are 100% busy, how can I figure out why their
behaviour changes?
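(One hedged way to start: when iostat shows the volume close to 100%
util, snapshot pg_stat_activity at that same moment to see which
statements have been running longest, then cross-reference the
timestamps with the log_checkpoints output already enabled in the
configuration above:)

iostat -x 5                          # watch %util on the RAID volume
psql -XAtc "SELECT pid, now() - query_start AS runtime, waiting,
                   left(query, 100)
            FROM pg_stat_activity
            WHERE state <> 'idle'
            ORDER BY runtime DESC"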
I have another question, about the master/slave configuration.
For my test, I will set up the SSD server as a slave of the HDD server.
After that, I will promote it to master.
In case I still have performance issues and must roll back, am I
necessarily forced to completely rebuild my new slave (HDD) with
pg_basebackup (and wait several hours for the files to transfer), or
can I turn the old master directly into a slave without waiting for a
rebuild (since the files should be the same on both servers)?
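(For reference, a rough sketch of the switchover order that would be
needed for the old master to be reusable as a standby without a full
copy; paths and hostname are placeholders, and since 9.3 has no
pg_rewind, a fresh pg_basebackup remains the safe fallback if the old
master was not shut down cleanly before the promotion or had WAL the
standby never received:)

# on the old (HDD) master: clean shutdown first, so no WAL is written
# after the standby's last receive point
pg_ctl -D /var/lib/postgresql/9.3/main -m fast stop

# on the SSD standby: promote it to master
pg_ctl -D /var/lib/postgresql/9.3/main promote

# back on the old master: make it a standby of the new master
cat > /var/lib/postgresql/9.3/main/recovery.conf <<EOF
standby_mode = 'on'
primary_conninfo = 'host=<ssd-server> port=5432 user=replication'
recovery_target_timeline = 'latest'
EOF
pg_ctl -D /var/lib/postgresql/9.3/main start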
Thanks
Thomas