Dear all,

I am facing a much, much slower query in production than on my development computer using a restored production backup, and I don't understand why, nor do I see what I could do to speed up the query on production :/

Short data model explanation: one table stores tickets, one table stores multicards; tickets can belong to a multicard through a many-to-one relationship, or be independent from multicards. The query I am interested in will be used to populate a new column in the multicards table containing the count of related tickets (see the PS below for the planned update).

On production:

# EXPLAIN ANALYZE SELECT transaction_uid, (SELECT COUNT(*) FROM tickets WHERE multicard_uid = multicards.uid) from multicards;
                                                                     QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------------------------
 Seq Scan on multicards  (cost=0.00..1455177.30 rows=204548 width=12) (actual time=0.178..1694987.355 rows=204548 loops=1)
   SubPlan 1
     ->  Aggregate  (cost=7.07..7.08 rows=1 width=8) (actual time=8.283..8.283 rows=1 loops=204548)
           ->  Index Only Scan using tickets_multicard_uid on tickets  (cost=0.43..7.05 rows=9 width=0) (actual time=1.350..8.280 rows=6 loops=204548)
                 Index Cond: (multicard_uid = multicards.uid)
                 Heap Fetches: 1174940
 Planning Time: 1.220 ms
 Execution Time: 1695029.673 ms

vmstat in the middle of this execution:

procs -----------memory---------- ---swap-- -----io---- -system-- ----cpu----
 r  b   swpd   free   buff   cache   si   so    bi    bo   in    cs us sy id wa
 0  1 479548 143652  79412 5609292    0    0  1986   116   999  1115  2  2 71 25
 3  1 479548 134312  79412 5614564    0    0  2386   128  2832  3973 20  2 49 28
 0  1 479548 142520  79428 5617708    0    0  1584   232  3440  3716 11  3 58 29
 0  1 479548 161184  79024 5597756    0    0  1922   144  1249  1562  1  2 70 27
 0  1 479548 161244  79048 5600804    0    0  1556   117  2138  3035  6  2 68 25
 0  2 479548 158384  79048 5604008    0    0  1388   402  2970  4320  6  2 66 27

On my development computer:

# EXPLAIN ANALYZE SELECT transaction_uid, (SELECT COUNT(*) FROM tickets WHERE multicard_uid = multicards.uid) from multicards;
                                                                     QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------------------------
 Seq Scan on multicards  (cost=0.00..538974.69 rows=204548 width=12) (actual time=0.055..451.691 rows=204548 loops=1)
   SubPlan 1
     ->  Aggregate  (cost=2.57..2.58 rows=1 width=8) (actual time=0.002..0.002 rows=1 loops=204548)
           ->  Index Only Scan using tickets_multicard_uid on tickets  (cost=0.43..2.56 rows=7 width=0) (actual time=0.001..0.001 rows=6 loops=204548)
                 Index Cond: (multicard_uid = multicards.uid)
                 Heap Fetches: 0
 Planning Time: 0.296 ms
 Execution Time: 456.677 ms

The execution time ratio is a huge 3700. I guess the Heap Fetches difference is the most meaningful here; my understanding would be that the index should easily fit in shared_buffers after a few subselects, as the configuration has 2GB of shared_buffers, and here is the index's on-disk size:

# SELECT relname as index, reltuples as "rows estimate", pg_size_pretty(pg_table_size(quote_ident(relname))) as "on disk size" FROM pg_class, pg_namespace WHERE pg_namespace.oid = pg_class.relnamespace AND relkind = 'i' AND nspname = 'public' AND relname = 'tickets_multicard_uid';
         index         | rows estimate | on disk size
-----------------------+---------------+--------------
 tickets_multicard_uid |    7.2136e+06 | 161 MB

Though it's not too clear to me what "heap fetches" are. It seems they might be actual table data fetches (i.e. not index fetches), but I don't really know why so many of them are needed here, nor how to reduce that (if that's the explanation for the longer timing).
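If heap fetches mean that the index-only scan has to visit the table to check tuple visibility (that's only my guess), then maybe the visibility map of tickets is not up to date on production and a plain VACUUM would reduce them? Assuming that guess is right, this is roughly what I would try (the first query only to see when vacuum/autovacuum last ran on the table before launching the VACUUM itself):

# SELECT relname, last_vacuum, last_autovacuum, n_live_tup, n_dead_tup FROM pg_stat_user_tables WHERE relname = 'tickets';
# VACUUM (VERBOSE) tickets;

Is that the right direction, or am I misreading what heap fetches are?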
It's true that production has constant activity, but not that much of it: load average is typically about 0.5, most queries run very quickly, and pg_stat_activity frequently reports no currently running query. Here's vmstat during normal production activity, for reference:

procs -----------memory---------- ---swap-- -----io---- -system-- ----cpu----
 r  b   swpd   free   buff   cache   si   so    bi    bo   in    cs us sy id wa
 0  0 480712 151112  10144 5664784    1    1   310    69     0     0  9  1 88  3
 0  0 480712 151312  10152 5664876    0    0     0   248  4911  7072  9  3 87  0
 0  0 480720 165488  10072 5642564    0    4    76   210  2019  2685  6  1 92  1
 0  0 480720 165332  10088 5642508    0    0     0   221  3535 17545 25  4 70  1
 0  0 480720 144772  10108 5643324    0    0    84   378  3833  5096 11  2 80  7
 0  0 480720 143300  10116 5644144    0    0    42   298  3446  4784  6  1 92  1
 0  0 480720 143300  10136 5644256    0    0    10  2340  1073  1496  1  1 96  2

Here's also a second, subsequent run on production for comparison:

# EXPLAIN ANALYZE SELECT transaction_uid, (SELECT COUNT(*) FROM tickets WHERE multicard_uid = multicards.uid) from multicards;
                                                                     QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------------------------
 Seq Scan on multicards  (cost=0.00..1455177.30 rows=204548 width=12) (actual time=0.101..834176.389 rows=204548 loops=1)
   SubPlan 1
     ->  Aggregate  (cost=7.07..7.08 rows=1 width=8) (actual time=4.075..4.075 rows=1 loops=204548)
           ->  Index Only Scan using tickets_multicard_uid on tickets  (cost=0.43..7.05 rows=9 width=0) (actual time=0.624..4.072 rows=6 loops=204548)
                 Index Cond: (multicard_uid = multicards.uid)
                 Heap Fetches: 1174941
 Planning Time: 0.273 ms
 Execution Time: 834209.323 ms

Heap fetches are still almost the same, although the timing is divided by two.

Both environments are running PostgreSQL 11.5 with 2GB of shared_buffers. The differences I can think of: production uses ext4 on drbd on SATA disks with Linux 3.2, while dev uses ext4 (no drbd) on an SSD with Linux 4.15. I can't believe the SSD alone would explain the difference? If it can, then I know what we should do on production..

Thanks for any hints/help!

-- 
Guillaume Cottenceau
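PS: for completeness, the update I ultimately plan to run is along these lines (the new column doesn't exist yet; tickets_count is just a placeholder name):

# ALTER TABLE multicards ADD COLUMN tickets_count integer;
# UPDATE multicards SET tickets_count = (SELECT COUNT(*) FROM tickets WHERE multicard_uid = multicards.uid);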