Dear all,

I am facing a much, much slower query in production than on my development computer using a restored production backup, and I don't understand why, nor do I see what I could do to speed up the query on production :/

Short data model explanation: one table stores tickets, one table stores multicards; tickets can belong to a multicard through a many-to-one relationship, or be independent from multicards. The query I am interested in will be used to populate a new column in the multicards table containing the count of related tickets (see the PS below for the planned update).

On production:

# EXPLAIN ANALYZE SELECT transaction_uid, (SELECT COUNT(*) FROM tickets WHERE multicard_uid = multicards.uid) from multicards;
                                                                     QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------------------------
 Seq Scan on multicards  (cost=0.00..1455177.30 rows=204548 width=12) (actual time=0.178..1694987.355 rows=204548 loops=1)
   SubPlan 1
     ->  Aggregate  (cost=7.07..7.08 rows=1 width=8) (actual time=8.283..8.283 rows=1 loops=204548)
           ->  Index Only Scan using tickets_multicard_uid on tickets  (cost=0.43..7.05 rows=9 width=0) (actual time=1.350..8.280 rows=6 loops=204548)
                 Index Cond: (multicard_uid = multicards.uid)
                 Heap Fetches: 1174940
 Planning Time: 1.220 ms
 Execution Time: 1695029.673 ms

vmstat in the middle of this execution:

procs -----------memory---------- ---swap-- -----io---- -system-- ----cpu----
 r  b   swpd   free   buff   cache   si   so    bi    bo   in    cs us sy id wa
 0  1 479548 143652  79412 5609292    0    0  1986   116   999  1115  2  2 71 25
 3  1 479548 134312  79412 5614564    0    0  2386   128  2832  3973 20  2 49 28
 0  1 479548 142520  79428 5617708    0    0  1584   232  3440  3716 11  3 58 29
 0  1 479548 161184  79024 5597756    0    0  1922   144  1249  1562  1  2 70 27
 0  1 479548 161244  79048 5600804    0    0  1556   117  2138  3035  6  2 68 25
 0  2 479548 158384  79048 5604008    0    0  1388   402  2970  4320  6  2 66 27

On my development computer:

# EXPLAIN ANALYZE SELECT transaction_uid, (SELECT COUNT(*) FROM tickets WHERE multicard_uid = multicards.uid) from multicards;
                                                                     QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------------------------
 Seq Scan on multicards  (cost=0.00..538974.69 rows=204548 width=12) (actual time=0.055..451.691 rows=204548 loops=1)
   SubPlan 1
     ->  Aggregate  (cost=2.57..2.58 rows=1 width=8) (actual time=0.002..0.002 rows=1 loops=204548)
           ->  Index Only Scan using tickets_multicard_uid on tickets  (cost=0.43..2.56 rows=7 width=0) (actual time=0.001..0.001 rows=6 loops=204548)
                 Index Cond: (multicard_uid = multicards.uid)
                 Heap Fetches: 0
 Planning Time: 0.296 ms
 Execution Time: 456.677 ms

The execution time ratio is a huge 3700. I guess the Heap Fetches difference is the most meaningful here; my understanding would be that the index should easily fit in shared_buffers after a few subselects, as the configuration has 2GB of shared_buffers, and here is the index's on-disk size:

# SELECT relname as index, reltuples as "rows estimate", pg_size_pretty(pg_table_size(quote_ident(relname))) as "on disk size" FROM pg_class, pg_namespace WHERE pg_namespace.oid = pg_class.relnamespace AND relkind = 'i' AND nspname = 'public' AND relname = 'tickets_multicard_uid';
         index         | rows estimate | on disk size
-----------------------+---------------+--------------
 tickets_multicard_uid |    7.2136e+06 | 161 MB

Though it's not too clear to me what "heap fetches" are. It seems they might be actual table data fetches (i.e. not index fetches), but I don't really know why so many of them are needed here, nor how to reduce that (if that's the explanation for the longer timing).
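If heap fetches mean that the index-only scan has to visit the table to check tuple visibility (that's only my guess), then maybe the visibility map of tickets is not up to date on production and a plain VACUUM would reduce them? Assuming that guess is right, this is roughly what I would try (the first query only to see when vacuum/autovacuum last ran on the table before launching the VACUUM itself):

# SELECT relname, last_vacuum, last_autovacuum, n_live_tup, n_dead_tup FROM pg_stat_user_tables WHERE relname = 'tickets';
# VACUUM (VERBOSE) tickets;

Is that the right direction, or am I misreading what heap fetches are?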
It's true that production has constant activity, but not that much of it: load average is typically about 0.5, most queries run very quickly, and pg_stat_activity frequently reports no currently running query. Here's vmstat during normal production activity, for reference:

procs -----------memory---------- ---swap-- -----io---- -system-- ----cpu----
 r  b   swpd   free   buff   cache   si   so    bi    bo   in    cs us sy id wa
 0  0 480712 151112  10144 5664784    1    1   310    69     0     0  9  1 88  3
 0  0 480712 151312  10152 5664876    0    0     0   248  4911  7072  9  3 87  0
 0  0 480720 165488  10072 5642564    0    4    76   210  2019  2685  6  1 92  1
 0  0 480720 165332  10088 5642508    0    0     0   221  3535 17545 25  4 70  1
 0  0 480720 144772  10108 5643324    0    0    84   378  3833  5096 11  2 80  7
 0  0 480720 143300  10116 5644144    0    0    42   298  3446  4784  6  1 92  1
 0  0 480720 143300  10136 5644256    0    0    10  2340  1073  1496  1  1 96  2

Here's also a second, subsequent run on production for comparison:

# EXPLAIN ANALYZE SELECT transaction_uid, (SELECT COUNT(*) FROM tickets WHERE multicard_uid = multicards.uid) from multicards;
                                                                     QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------------------------
 Seq Scan on multicards  (cost=0.00..1455177.30 rows=204548 width=12) (actual time=0.101..834176.389 rows=204548 loops=1)
   SubPlan 1
     ->  Aggregate  (cost=7.07..7.08 rows=1 width=8) (actual time=4.075..4.075 rows=1 loops=204548)
           ->  Index Only Scan using tickets_multicard_uid on tickets  (cost=0.43..7.05 rows=9 width=0) (actual time=0.624..4.072 rows=6 loops=204548)
                 Index Cond: (multicard_uid = multicards.uid)
                 Heap Fetches: 1174941
 Planning Time: 0.273 ms
 Execution Time: 834209.323 ms

Heap fetches are still almost the same, although the timing is divided by two.

Both environments are running PostgreSQL 11.5 with 2GB of shared_buffers. The differences I can think of: production uses ext4 on drbd on SATA disks with Linux 3.2, while dev uses ext4 (no drbd) on an SSD with Linux 4.15. I can't believe the SSD alone would explain the difference? If it can, then I know what we should do on production..

Thanks for any hints/help!

-- 
Guillaume Cottenceau
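PS: for completeness, the update I ultimately plan to run is along these lines (the new column doesn't exist yet; tickets_count is just a placeholder name):

# ALTER TABLE multicards ADD COLUMN tickets_count integer;
# UPDATE multicards SET tickets_count = (SELECT COUNT(*) FROM tickets WHERE multicard_uid = multicards.uid);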