Hi,
I found a strange case where a query (which works through a large amount of shared buffers) runs about 2x faster on the second and subsequent runs within a new connection to the database.
For example:
postgres@base1:~$ psql ***
psql (9.3.10)
***=# explain (analyze, costs, buffers, timing) select * from transactions where "timestamp" >= '2016-02-20' and "timestamp" < '2016-02-23'::date;
QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------------------------------------
Index Scan using i_transactions_timestamp on transactions (cost=0.57..138824.52 rows=1955459 width=790) (actual time=0.146..3416.477 rows=1950630 loops=1)
Index Cond: (("timestamp" >= '2016-02-20 00:00:00'::timestamp without time zone) AND ("timestamp" < '2016-02-23'::date))
Buffers: shared hit=1965635
Total runtime: 3481.322 ms
(4 rows)
***=# explain (analyze, costs, buffers, timing) select * from transactions where "timestamp" >= '2016-02-20' and "timestamp" < '2016-02-23'::date;
QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------------------------------------
Index Scan using i_transactions_timestamp on transactions (cost=0.57..138824.52 rows=1955459 width=790) (actual time=0.030..1812.361 rows=1950630 loops=1)
Index Cond: (("timestamp" >= '2016-02-20 00:00:00'::timestamp without time zone) AND ("timestamp" < '2016-02-23'::date))
Buffers: shared hit=1965635
Total runtime: 1878.503 ms
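For scale (assuming the default 8 kB block size): 1965635 shared buffer hits is roughly 1965635 * 8 kB ≈ 15 GB of buffer accesses per execution, and all of it is served from shared_buffers (no "read=" in the Buffers line).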
Every run after that takes 1.8-1.9s, but if I establish a new connection to the database, the first query takes 3.5s again.
The time difference and the timing of each run are quite repeatable (+/- 100ms).
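(For reference, a per-backend profile like the ones below can be collected roughly like this; the exact perf options are only a sketch, not necessarily what I used:

    in psql:  SELECT pg_backend_pid();
    in a shell, while the query runs:
        perf record -g -p <backend_pid> -- sleep 10
        perf report --stdio
)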
Here is the perf report data for the first and for the second run.
The first run (something fishy is going on with the kernel calls):
19,60% postgres [kernel.kallsyms] [k] filemap_map_pages
15,86% postgres postgres [.] hash_search_with_hash_value
8,20% postgres postgres [.] heap_hot_search_buffer
8,20% postgres postgres [.] heap_page_prune_opt
5,72% postgres postgres [.] PinBuffer
4,38% postgres [kernel.kallsyms] [k] page_fault
4,04% postgres [kernel.kallsyms] [k] page_waitqueue
3,55% postgres [kernel.kallsyms] [k] __wake_up_bit
2,95% postgres postgres [.] LWLockAcquire
2,31% postgres [kernel.kallsyms] [k] unlock_page
1,96% postgres [vdso] [.] __vdso_gettimeofday
1,83% postgres [kernel.kallsyms] [k] radix_tree_next_chunk
1,77% postgres [kernel.kallsyms] [k] page_add_file_rmap
1,66% postgres postgres [.] _bt_checkkeys
1,27% postgres postgres [.] LWLockRelease
The second run (looks perfectly OK for this kind of query):
27,74% postgres postgres [.] hash_search_with_hash_value
15,51% postgres postgres [.] heap_hot_search_buffer
14,28% postgres postgres [.] heap_page_prune_opt
9,22% postgres postgres [.] PinBuffer
3,94% postgres [vdso] [.] __vdso_gettimeofday
3,32% postgres postgres [.] _bt_checkkeys
3,29% postgres postgres [.] LWLockAcquire
2,86% postgres postgres [.] LWLockRelease
1,54% postgres postgres [.] UnpinBuffer
So it looks like something strange is going on inside the Linux kernel 3.16.0 memory management (it's a vanilla kernel on bare hardware, no virtualization, swap off).
The question is: is this working as expected (in which case it's probably a good idea to use pgbouncer even for one-shot analytical queries), or is it a sign of a potential issue with the kernel?
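One way to sanity-check the page-fault angle (just a sketch, the pid is illustrative): watch the backend's minor fault counter around the first and the second run in a fresh connection:

    in psql:  SELECT pg_backend_pid();
    before/after each run, in a shell:
        ps -o min_flt,maj_flt -p <backend_pid>
    or while the query runs:
        perf stat -e minor-faults,major-faults -p <backend_pid> -- sleep 5

If the first run in a fresh connection produces a large number of minor faults and the second one almost none, that would support the idea that the extra time is spent in kernel page fault handling rather than inside PostgreSQL.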
Maxim Boguk
Senior Postgresql DBA
http://www.postgresql-consulting.com/
Phone RU: +7 910 405 4718
Phone AU: +61 45 218 5678
LinkedIn: http://www.linkedin.com/pub/maksym-boguk/80/b99/b1b
Skype: maxim.boguk
Jabber: maxim.boguk@xxxxxxxxx
"People problems are solved with people.
If people cannot solve the problem, try technology.
People will then wish they'd listened at the first stage."