On 10/12/2015 12:06 PM, Lele Gaifax wrote:
Hi all, I'm doing some experiments to find the better layout for reimplementing an existing db (MySQL cough!) with PostgreSQL 9.4+. I noticed a strange plan coming out from a simple query joining two tables, both containing 10Mrecs (and both ANALYZEd): l10ntest=# \d master; Table "public.master" Column | Type | Modifiers --------+---------+------------------------------------------------------ num | integer | not null default nextval('master_num_seq'::regclass) Indexes: "master_pkey" PRIMARY KEY, btree (num) l10ntest=# \d master_l10n; Table "public.master_l10n" Column | Type | Modifiers --------+----------------------+----------------------------------------------------------- num | integer | not null default nextval('master_l10n_num_seq'::regclass) lang | character varying(2) | not null text | text | Indexes: "master_l10n_pkey" PRIMARY KEY, btree (num, lang) "l10n_text_index" btree (lower(text) text_pattern_ops) l10ntest=# EXPLAIN SELECT count(l.num) AS count_1 FROM master_l10n l WHERE l.lang = 'it' AND lower(l.text) LIKE 'quattro%'; QUERY PLAN ------------------------------------------------------------------------------------------------------ Aggregate (cost=309315.38..309315.39 rows=1 width=4) -> Bitmap Heap Scan on master_l10n l (cost=64700.56..307801.65 rows=605492 width=4) Filter: (((lang)::text = 'it'::text) AND (lower(text) ~~ 'quattro%'::text)) -> Bitmap Index Scan on l10n_text_index (cost=0.00..64549.19 rows=999662 width=0) Index Cond: ((lower(text) ~>=~ 'quattro'::text) AND (lower(text) ~<~ 'quattrp'::text)) (5 rows) Time: 1.665 ms l10ntest=# EXPLAIN SELECT count(m.num) AS count_1 FROM master_l10n l JOIN master m ON m.num = l.num WHERE l.lang = 'it' AND lower(l.text) LIKE 'quattro%'; QUERY PLAN ------------------------------------------------------------------------------------------------------------ Aggregate (cost=676558.14..676558.15 rows=1 width=4) -> Hash Join (cost=373011.02..675044.41 rows=605492 width=4) Hash Cond: (l.num = m.num) -> Bitmap Heap Scan on master_l10n l (cost=64700.56..307801.65 rows=605492 width=4) Filter: (((lang)::text = 'it'::text) AND (lower(text) ~~ 'quattro%'::text)) -> Bitmap Index Scan on l10n_text_index (cost=0.00..64549.19 rows=999662 width=0) Index Cond: ((lower(text) ~>=~ 'quattro'::text) AND (lower(text) ~<~ 'quattrp'::text)) -> Hash (cost=144247.76..144247.76 rows=9999976 width=4) -> Seq Scan on master m (cost=0.00..144247.76 rows=9999976 width=4) (9 rows) Time: 1.244 ms l10ntest=# SELECT count(l.num) AS count_1 FROM master_l10n l WHERE l.lang = 'it' AND lower(l.text) LIKE 'quattro%'; count_1 --------- 1101101 (1 row) Time: 1221.941 ms l10ntest=# SELECT count(m.num) AS count_1 FROM master_l10n l JOIN master m ON m.num = l.num WHERE l.lang = 'it' AND lower(l.text) LIKE 'quattro%'; count_1 --------- 1101101 (1 row) Time: 3541.852 ms Why does the join on the master table require a "Seq Scan on master"? I tried different kinds of "JOIN", but the resulting explanation remains the same. Am I missing something, or should I stop worrying about that sequential scan?
Off hand I would say it is because of this --> count(m.num). Try count(l.num) instead and see what happens. As your queries above show they are the same number.
Thanks in advance, bye, lele.
-- Adrian Klaver adrian.klaver@xxxxxxxxxxx -- Sent via pgsql-general mailing list (pgsql-general@xxxxxxxxxxxxxx) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general