On 07/29/2016 08:04 AM, trafdev wrote:
Hi.

I have an OLAP-oriented DB (light occasional bulk writes and heavy
aggregated selects over large periods of data) based on Postgres 9.5.3.

Server is FreeBSD 10.3 with 64GB of RAM and 2x500GB SSD (root on ZFS,
mirror). The largest table is 13GB (with a 4GB index on it); the other
tables are 4GB, 2GB, and less than 1GB.

After reading a lot of articles and how-tos, I've collected the following
set of tweaks and hints:

ZFS pool creation:

zfs create zroot/ara/sqldb
zfs create -o recordsize=8k -o primarycache=all zroot/ara/sqldb/pgsql

zfs get primarycache,recordsize,logbias,compression zroot/ara/sqldb/pgsql
NAME                   PROPERTY      VALUE    SOURCE
zroot/ara/sqldb/pgsql  primarycache  all      local
zroot/ara/sqldb/pgsql  recordsize    8K       local
zroot/ara/sqldb/pgsql  logbias       latency  local
zroot/ara/sqldb/pgsql  compression   lz4      inherited from zroot

L2ARC is disabled.
VDEV cache is disabled.

pgsql -c "mkdir /ara/sqldb/pgsql/data_ix"
pgsql -c "initdb --locale=en_US.UTF-8 -E UTF-8 -D /ara/sqldb/pgsql/data"

/etc/sysctl.conf:
vfs.zfs.metaslab.lba_weighting_enabled=0

postgresql.conf:
listen_addresses = '*'
max_connections = 100
shared_buffers = 16GB
effective_cache_size = 48GB
It may not be a problem for your workload, but this effective_cache_size value is far too high.
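To put rough numbers on that (an illustrative sketch, not a tuned
recommendation): effective_cache_size should estimate how much cache a
query can realistically expect, i.e. shared_buffers plus whatever the ARC
can actually hold for data. With 16GB of shared_buffers, up to 100
connections at work_mem = 500MB, and the ARC needing its own headroom,
48GB of cache on a 64GB box is optimistic. Something closer to this might
be more defensible:

    # postgresql.conf -- illustrative numbers only
    shared_buffers = 16GB
    effective_cache_size = 32GB   # ~ shared_buffers + a realistic ARC share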
work_mem = 500MB
maintenance_work_mem = 2GB
min_wal_size = 4GB
max_wal_size = 8GB
checkpoint_completion_target = 0.9
You probably need to increase the checkpoint_timeout too.
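For example (values are illustrative, not tuned for this workload):

    # postgresql.conf -- illustrative values
    checkpoint_timeout = 30min   # default 5min is aggressive for bulk loads
    checkpoint_completion_target = 0.9
    log_checkpoints = on         # see what actually triggers checkpoints

With log_checkpoints = on, the server log shows whether checkpoints fire
on the timeout or because max_wal_size was reached, which tells you which
knob to raise.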
wal_buffers = 16MB
default_statistics_target = 500
random_page_cost = 1
log_lock_waits = on
log_directory = 'pg_log'
log_filename = 'postgresql-%Y-%m-%d_%H%M%S.log'
log_destination = 'csvlog'
logging_collector = on
log_min_duration_statement = 10000
shared_preload_libraries = 'pg_stat_statements'
track_activity_query_size = 10000
track_io_timing = on

zfs-stats -A
------------------------------------------------------------------------
ZFS Subsystem Report                            Thu Jul 28 21:58:46 2016
------------------------------------------------------------------------
ARC Summary: (HEALTHY)
        Memory Throttle Count:                  0

ARC Misc:
        Deleted:                                14.92b
        Recycle Misses:                         7.01m
        Mutex Misses:                           4.72m
        Evict Skips:                            1.28b

ARC Size:                               53.27%  32.59 GiB
        Target Size: (Adaptive)         53.28%  32.60 GiB
        Min Size (Hard Limit):          12.50%  7.65 GiB
        Max Size (High Water):          8:1     61.18 GiB

ARC Size Breakdown:
        Recently Used Cache Size:       92.83%  30.26 GiB
        Frequently Used Cache Size:     7.17%   2.34 GiB

ARC Hash Breakdown:
        Elements Max:                           10.36m
        Elements Current:               78.09%  8.09m
        Collisions:                             9.63b
        Chain Max:                              26
        Chains:                                 1.49m
------------------------------------------------------------------------

zfs-stats -E
------------------------------------------------------------------------
ZFS Subsystem Report                            Thu Jul 28 21:59:57 2016
------------------------------------------------------------------------
ARC Efficiency:                                 49.85b
        Cache Hit Ratio:                70.94%  35.36b
        Cache Miss Ratio:               29.06%  14.49b
        Actual Hit Ratio:               66.32%  33.06b

        Data Demand Efficiency:         84.85%  25.39b
        Data Prefetch Efficiency:       17.85%  12.90b

        CACHE HITS BY CACHE LIST:
          Anonymously Used:             4.10%   1.45b
          Most Recently Used:           37.82%  13.37b
          Most Frequently Used:         55.67%  19.68b
          Most Recently Used Ghost:     0.58%   203.42m
          Most Frequently Used Ghost:   1.84%   649.83m

        CACHE HITS BY DATA TYPE:
          Demand Data:                  60.92%  21.54b
          Prefetch Data:                6.51%   2.30b
          Demand Metadata:              32.56%  11.51b
          Prefetch Metadata:            0.00%   358.22k

        CACHE MISSES BY DATA TYPE:
          Demand Data:                  26.55%  3.85b
          Prefetch Data:                73.13%  10.59b
          Demand Metadata:              0.31%   44.95m
          Prefetch Metadata:            0.00%   350.48k
------------------------------------------------------------------------

zfs-stats -Z
------------------------------------------------------------------------
ZFS Subsystem Report                            Thu Jul 28 22:02:46 2016
------------------------------------------------------------------------
File-Level Prefetch: (HEALTHY)
DMU Efficiency:                                 49.97b
        Hit Ratio:                      55.85%  27.90b
        Miss Ratio:                     44.15%  22.06b

        Colinear:                               22.06b
          Hit Ratio:                    0.04%   7.93m
          Miss Ratio:                   99.96%  22.05b

        Stride:                                 17.85b
          Hit Ratio:                    99.61%  17.78b
          Miss Ratio:                   0.39%   69.46m

DMU Misc:
        Reclaim:                                22.05b
          Successes:                    0.05%   10.53m
          Failures:                     99.95%  22.04b

        Streams:                                10.14b
          +Resets:                      0.10%   9.97m
          -Resets:                      99.90%  10.13b
          Bogus:                        0

Notes/concerns:

- primarycache=metadata (recommended in most articles) produces a
  significant performance degradation (in SELECT queries);
Those articles are wrong. PostgreSQL relies on the filesystem cache, so it
needs primarycache=all.
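If the dataset ever got switched to metadata-only while experimenting, it
can be set back and verified like this (using the dataset name from your
post):

    zfs set primarycache=all zroot/ara/sqldb/pgsql
    zfs get primarycache zroot/ara/sqldb/pgsql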
- from what I can see, Postgres uses memory too carefully. I would like
  to somehow force it to keep accessed data in memory as long as
  possible. Instead, I often see that even frequently accessed data is
  pushed out of the memory cache for no apparent reason.
This is probably a consequence of the primarycache misconfiguration.
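If you want to see what PostgreSQL itself keeps in shared_buffers (as
opposed to what lives in the ARC), the pg_buffercache extension helps; a
sketch, to be run in the database of interest:

    CREATE EXTENSION IF NOT EXISTS pg_buffercache;

    -- top relations by space currently held in shared_buffers (8kB pages)
    SELECT c.relname, count(*) * 8 / 1024 AS cached_mb
    FROM pg_buffercache b
    JOIN pg_class c ON b.relfilenode = pg_relation_filenode(c.oid)
    WHERE b.reldatabase IN (0, (SELECT oid FROM pg_database
                                WHERE datname = current_database()))
    GROUP BY c.relname
    ORDER BY cached_mb DESC
    LIMIT 10;

Running it before and after one of the evictions you describe shows
whether the data left shared_buffers or only the ARC.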
Am I missing something important in my configs? Are there any double
writes/reads somewhere because of the OS/ZFS/Postgres caches, and how can
I avoid them? Please share your experience/tips. Thanks.
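On the double-caching question: with primarycache=all, hot blocks are
indeed held twice, once in shared_buffers and once in the ARC. A common
mitigation on FreeBSD is to cap the ARC in /boot/loader.conf so it does
not compete with shared_buffers for the whole machine (the value below is
only an example; size it from your own measurements):

    # /boot/loader.conf
    vfs.zfs.arc_max="32G"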
--
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services