Hi.
I'm trying to build an OLAP-oriented DB based on PostgresSQL.
User works with a paginated report in the web-browser. Interface allows
to fetch data for a custom date-range selection,
display individual rows (20-50 per page) and totals (for entire
selection, even not visible on the current page) and sorting by any column.
The main goal is to deliver results of the basic SELECT queries to the
end-user in less than 2 seconds.
I was able to achieve that except for one table (the biggest one).
It consist of multiple dimensions (date, gran, aid, pid, sid, fid,
subid) and metrics (see below).
User can filter by any dimension and sort by any metric.
Here is a CREATE script for this table:
CREATE TABLE stats.feed_sub
(
date date NOT NULL,
gran interval NOT NULL,
aid smallint NOT NULL,
pid smallint NOT NULL,
sid smallint NOT NULL,
fid smallint NOT NULL,
subid text NOT NULL,
rev_est_pub real NOT NULL,
rev_est_feed real NOT NULL,
rev_raw real NOT NULL,
c_total bigint NOT NULL,
c_passed bigint NOT NULL,
q_total bigint NOT NULL,
q_passed bigint NOT NULL,
q_filt_geo bigint NOT NULL,
q_filt_browser bigint NOT NULL,
q_filt_os bigint NOT NULL,
q_filt_ip bigint NOT NULL,
q_filt_subid bigint NOT NULL,
q_filt_pause bigint NOT NULL,
q_filt_click_cap_ip bigint NOT NULL,
q_filt_query_cap bigint NOT NULL,
q_filt_click_cap bigint NOT NULL,
q_filt_rev_cap bigint NOT NULL,
q_filt_erpm_floor bigint NOT NULL,
c_filt_click_cap_ip bigint NOT NULL,
c_filt_doubleclick bigint NOT NULL,
c_filt_url_expired bigint NOT NULL,
c_filt_fast_click bigint NOT NULL,
c_filt_delay_clicks bigint NOT NULL,
c_filt_ip_mismatch bigint NOT NULL,
c_filt_ref_mismatch bigint NOT NULL,
c_filt_lng_mismatch bigint NOT NULL,
c_filt_ua_mismatch bigint NOT NULL,
res_impr bigint NOT NULL,
rev_ver_pub real,
rev_ver_feed real,
c_ver bigint,
q_filt_ref bigint NOT NULL
)
WITH (
OIDS=FALSE
);
CREATE INDEX ix_feed_sub_date
ON stats.feed_sub
USING brin
(date);
CREATE UNIQUE INDEX ixu_feed_sub
ON stats.feed_sub
USING btree
(date, gran, aid, pid, sid, fid, subid COLLATE pg_catalog."default");
Here is some sizing info (https://wiki.postgresql.org/wiki/Disk_Usage):
relation,size
stats.feed_sub,5644 MB
stats.ixu_feed_sub,1594 MB
row_estimate
15865627
Here is the typical query (for totals beige):
SELECT
sum(stats.feed_sub.c_filt_click_cap_ip) AS clicks_from_ip,
sum(stats.feed_sub.c_filt_doubleclick) AS clicks_on_target,
sum(stats.feed_sub.c_filt_delay_clicks) AS ip_click_period,
sum(stats.feed_sub.c_filt_fast_click) AS fast_click,
sum(stats.feed_sub.c_filt_ip_mismatch) AS ip_mismatch,
sum(stats.feed_sub.c_filt_lng_mismatch) AS lng_mismatch,
sum(stats.feed_sub.c_filt_ref_mismatch) AS ref_mismatch,
sum(stats.feed_sub.c_filt_ua_mismatch) AS ua_mismatch,
sum(stats.feed_sub.c_filt_url_expired) AS url_expired,
stats.feed_sub.subid AS stats_feed_sub_subid,
stats.feed_sub.sid AS stats_feed_sub_sid
FROM stats.feed_sub
WHERE stats.feed_sub.date >= '2016-06-01' :: TIMESTAMP AND
stats.feed_sub.date <= '2016-06-30' :: TIMESTAMP AND
stats.feed_sub.gran = '1 day'
AND stats.feed_sub.aid = 3
GROUP BY
stats.feed_sub.subid, stats.feed_sub.sid;
QUERY PLAN
HashAggregate (cost=901171.72..912354.97 rows=344100 width=86) (actual
time=7207.825..7335.473 rows=126044 loops=1)
" Group Key: subid, sid"
Buffers: shared hit=3635804
-> Index Scan using ixu_feed_sub on feed_sub (cost=0.56..806544.38
rows=3440994 width=86) (actual time=0.020..3650.208 rows=3578344 loops=1)
Index Cond: ((date >= '2016-06-01 00:00:00'::timestamp without
time zone) AND (date <= '2016-06-30 00:00:00'::timestamp without time
zone) AND (gran = '1 day'::interval) AND (aid = 3))
Buffers: shared hit=3635804
Planning time: 0.150 ms
Execution time: 7352.009 ms
As I can see - it takes 3.6 seconds just for an index scan (which sits
in RAM).
+3 seconds for groupings +1-2 seconds for network transfers, so I'm
completely out of my "sub 2 seconds" goal.
Questions are:
1. Am I using the right DB\architecture for achieving my goal? Are there
any better solution for that?
2. Have I reached some physical limits? Will installing faster RAM\CPU help?
Thanks in advance!
Server config:
OS:
> uname -a
FreeBSD sqldb 10.2-RELEASE-p9 FreeBSD 10.2-RELEASE-p9 #0: Thu Jan 14
01:32:46 UTC 2016
root@xxxxxxxxxxxxxxxxxxxxxxxxxxxxx:/usr/obj/usr/src/sys/GENERIC amd64
CPU: Intel(R) Xeon(R) CPU E5-1630 v3
> sysctl -a | egrep -i 'hw.machine|hw.model|hw.ncpu'
hw.machine: amd64
hw.model: Intel(R) Xeon(R) CPU E5-1630 v3 @ 3.70GHz
hw.ncpu: 8
hw.machine_arch: amd64
MEM: 64GB
> sysctl hw.physmem
hw.physmem: 68572983296
HDD: 2x480GB SSD (ZFS mirror)
> camcontrol devlist
<INTEL SSDSC2BB480H4 D2010380> at scbus5 target 0 lun 0 (ada0,pass1)
<INTEL SSDSC2BB480H4 D2010380> at scbus6 target 0 lun 0 (ada1,pass2)
FS:
> zfs list
NAME USED AVAIL REFER MOUNTPOINT
zroot 36.5G 390G 96K /zroot
...
zroot/ara/sqldb/pgsql 33.7G 390G 33.7G /ara/sqldb/pgsql
> zfs get primarycache,recordsize,logbias,compression zroot/ara/sqldb/pgsql
NAME PROPERTY VALUE SOURCE
zroot/ara/sqldb/pgsql primarycache all local
zroot/ara/sqldb/pgsql recordsize 8K local
zroot/ara/sqldb/pgsql logbias latency local
zroot/ara/sqldb/pgsql compression lz4 inherited from zroot
Misc:
> cat /etc/sysctl.conf
vfs.zfs.metaslab.lba_weighting_enabled=0
Postgres:
> /usr/local/bin/postgres --version
postgres (PostgreSQL) 9.5.3
> cat postgresql.conf:
...
listen_addresses = '*'
max_connections = 100
shared_buffers = 16GB
effective_cache_size = 48GB
work_mem = 500MB
maintenance_work_mem = 2GB
min_wal_size = 4GB
max_wal_size = 8GB
checkpoint_completion_target = 0.9
wal_buffers = 16MB
default_statistics_target = 500
random_page_cost = 1
log_lock_waits = on
log_directory = 'pg_log'
log_filename = 'postgresql-%Y-%m-%d_%H%M%S.log'
log_destination = 'csvlog'
logging_collector = on
log_min_duration_statement = 10000
shared_preload_libraries = 'pg_stat_statements'
track_activity_query_size = 10000
track_io_timing = on
--
Sent via pgsql-performance mailing list (pgsql-performance@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance