So far that one was the worst in terms of cost and time. Here are all the plans with buffers, more records, and results being returned. At this point I have good enough performance with my UNION approach but I'm just trying to learn now. WHY is the union approach the fastest? I would have expected the EXISTS or IN approaches to be faster or at least have the SAME cost? At this point I just want to understand.
dft1fjfv106r48=> explain (analyze, buffers, verbose)
select *
from contacts c
where EXISTS(
(select 1 from phone_numbers p where c.id = p.contact_id and p.national = 5038904993 and p.e164 = '+15038904993')
union
(select 1 id = e.contact_id and e.email = 'robert.xxxx@xxxxxxxxx'));
QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Seq Scan on public.contacts c (cost=0.00..23238.90 rows=1425 width=95) (actual time=2.241..46.817 rows=7 loops=1)
Output: c.id, c.owner_id, c.user_id, c.device_id, c.last_call, c.record_id, c.dtype, c.blocked, c.details_hash, c.fname, c.lname, c.fb_id
Filter: (SubPlan 1)
Rows Removed by Filter: 2843
Buffers: shared hit=11497
SubPlan 1
-> Unique (cost=8.13..8.13 rows=2 width=0) (actual time=0.015..0.015 rows=0 loops=2850)
Output: (1)
Buffers: shared hit=11440
-> Sort (cost=8.13..8.13 rows=2 width=0) (actual time=0.013..0.013 rows=0 loops=2850)
Output: (1)
Sort Key: (1)
Sort Method: quicksort Memory: 25kB
Buffers: shared hit=11440
-> Append (cost=0.06..8.13 rows=2 width=0) (actual time=0.009..0.009 rows=0 loops=2850)
Buffers: shared hit=11440
-> Index Only Scan using idx_phone on public.phone_numbers p (cost=0.06..4.06 rows=1 width=0) (actual time=0.003..0.003 rows=0 loops=2850)
Output: 1
Index Cond: ((p.contact_id = c.id) AND (p."national" = 5038904993::bigint) AND (p.e164 = '+15038904993'::text))
Heap Fetches: 11
Buffers: shared hit=5721
-> Index Only Scan using idx_email_full on public.email_addresses e (cost=0.06..4.06 rows=1 width=0) (actual time=0.003..0.003 rows=0 loops=2850)
Output: 1
Index Cond: ((e.contact_id = c.id) AND (e.email = 'robert.xxxx@xxxxxxxxx'::text))
Heap Fetches: 5
Buffers: shared hit=5719
Total runtime: 46.897 ms
(27 rows)
dft1fjfv106r48=> explain (analyze, buffers, verbose)
select *
from contacts where id IN (
(select c.id
from contacts c join phone_numbers p on c.id = p.contact_id and p.national = 5038904993 and p.e164 = '+15038904993')
union
(select c.id from contacts c join email_addresses e on c.id = e.contact_id and e.email = 'robert.xxxx@xxxxxxxxx'));
QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Nested Loop (cost=29.38..53.74 rows=6 width=95) (actual time=0.356..0.418 rows=7 loops=1)
Output: contacts.id, contacts.owner_id, contacts.user_id, contacts.device_id, contacts.last_call, contacts.record_id, contacts.dtype, contacts.blocked, contacts.details_hash, contacts.fname, contacts.lname, contacts.fb_id
Buffers: shared hit=87
-> HashAggregate (cost=29.32..29.34 rows=6 width=8) (actual time=0.347..0.354 rows=7 loops=1)
Output: c.id
Buffers: shared hit=66
-> Append (cost=0.11..29.32 rows=6 width=8) (actual time=0.047..0.316 rows=16 loops=1)
Buffers: shared hit=66
-> Nested Loop (cost=0.11..8.12 rows=1 width=8) (actual time=0.045..0.169 rows=11 loops=1)
Output: c.id
Buffers: shared hit=43
-> Index Scan using idx_phone_address on public.phone_numbers p (cost=0.06..4.06 rows=1 width=8) (actual time=0.027..0.047 rows=11 loops=1)
Output: p.id, p.contact_id, p."national", p.e164, p.raw_number
Index Cond: ((p."national" = 5038904993::bigint) AND ((p.e164)::text = '+15038904993'::text))
Buffers: shared hit=9
-> Index Only Scan using idx_contacts_pkey_owner on public.contacts c (cost=0.06..4.06 rows=1 width=8) (actual time=0.005..0.006 rows=1 loops=11)
Output: c.id, c.owner_id, c.user_id
Index Cond: (c.id = p.contact_id)
Heap Fetches: 11
Buffers: shared hit=34
-> Nested Loop (cost=2.12..21.17 rows=5 width=8) (actual time=0.057..0.114 rows=5 loops=1)
Output: c_1.id
Buffers: shared hit=23
-> Bitmap Heap Scan on public.email_addresses e (cost=2.06..8.85 rows=5 width=8) (actual time=0.044..0.055 rows=5 loops=1)
Output: e.id, e.contact_id, e.email
Recheck Cond: ((e.email)::text = 'robert.xxxx@xxxxxxxxx'::text)
Buffers: shared hit=7
-> Bitmap Index Scan on idx_email_address (cost=0.00..2.06 rows=5 width=0) (actual time=0.031..0.031 rows=6 loops=1)
Index Cond: ((e.email)::text = 'robert.xxxx@xxxxxxxxx'::text)
Buffers: shared hit=2
-> Index Only Scan using idx_contacts_pkey_owner on public.contacts c_1 (cost=0.06..2.46 rows=1 width=8) (actual time=0.005..0.006 rows=1 loops=5)
Output: c_1.id, c_1.owner_id, c_1.user_id
Index Cond: (c_1.id = e.contact_id)
Heap Fetches: 5
Buffers: shared hit=16
-> Index Scan using idx_contacts_pkey_owner on public.contacts (cost=0.06..4.06 rows=1 width=95) (actual time=0.003..0.004 rows=1 loops=7)
Output: contacts.id, contacts.owner_id, contacts.user_id, contacts.device_id, contacts.last_call, contacts.record_id, contacts.dtype, contacts.blocked, contacts.details_hash, contacts.fname, contacts.lname, contacts.fb_id
Index Cond: (contacts.id = c.id)
Buffers: shared hit=21
Total runtime: 0.535 ms
(40 rows)
dft1fjfv106r48=> explain (analyze, buffers, verbose)
select c.*
from contacts c
where EXISTS(select 1 from phone_numbers p where (p.national = 5038904993 and p.e164 = '+15038904993') and p.contact_id = c.id)
or EXISTS(select 1 from email_addresses e where e.email = 'robert.xxxx@xxxxxxxxx' and e.contact_id = c.id);
QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------------------------
Seq Scan on public.contacts c (cost=0.00..23213.25 rows=2138 width=95) (actual time=0.209..1.290 rows=7 loops=1)
Output: c.id, c.owner_id, c.user_id, c.device_id, c.last_call, c.record_id, c.dtype, c.blocked, c.details_hash, c.fname, c.lname, c.fb_id
Filter: ((alternatives: SubPlan 1 or hashed SubPlan 2) OR (alternatives: SubPlan 3 or hashed SubPlan 4))
Rows Removed by Filter: 2843
Buffers: shared hit=73
SubPlan 1
-> Index Only Scan using idx_phone on public.phone_numbers p (cost=0.06..4.06 rows=1 width=0) (never executed)
Index Cond: ((p.contact_id = c.id) AND (p."national" = 5038904993::bigint) AND (p.e164 = '+15038904993'::text))
Heap Fetches: 0
SubPlan 2
-> Index Scan using idx_phone_address on public.phone_numbers p_1 (cost=0.06..4.06 rows=1 width=8) (actual time=0.033..0.056 rows=11 loops=1)
Output: p_1.contact_id
Index Cond: ((p_1."national" = 5038904994::bigint) AND ((p_1.e164)::text = '+15038904993'::text))
Buffers: shared hit=9
SubPlan 3
-> Index Only Scan using idx_email_full on public.email_addresses e (cost=0.06..4.06 rows=1 width=0) (never executed)
Index Cond: ((e.contact_id = c.id) AND (e.email = 'robert.xxxx@xxxxxxxxx'::text))
Heap Fetches: 0
SubPlan 4
-> Bitmap Heap Scan on public.email_addresses e_1 (cost=2.06..8.85 rows=5 width=8) (actual time=0.040..0.050 rows=5 loops=1)
Output: e_1.contact_id
Recheck Cond: ((e_1.email)::text = 'robert.xxxx@xxxxxxxxx'::text)
Buffers: shared hit=7
-> Bitmap Index Scan on idx_email_address (cost=0.00..2.06 rows=5 width=0) (actual time=0.030..0.030 rows=6 loops=1)
Index Cond: ((e_1.email)::text = 'robert.xxxx@xxxxxxxxx'::text)
Buffers: shared hit=2
Total runtime: 1.395 ms
(27 rows)
dft1fjfv106r48=> explain (analyze, buffers, verbose)
select c.*
from contacts c
where c.id IN (
select p.contact_id from phone_numbers p where (p.national = 5038904993 and p.e164 = '+15038904993'))
or c.id IN (
select e.contact_id from email_addresses e where e.email = 'robert.xxxx@xxxxxxxxx');
QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------------------------------
Seq Scan on public.contacts c (cost=12.92..81.32 rows=2138 width=95) (actual time=0.208..1.283 rows=7 loops=1)
Output: c.id, c.owner_id, c.user_id, c.device_id, c.last_call, c.record_id, c.dtype, c.blocked, c.details_hash, c.fname, c.lname, c.fb_id
Filter: ((hashed SubPlan 1) OR (hashed SubPlan 2))
Rows Removed by Filter: 2843
Buffers: shared hit=73
SubPlan 1
-> Index Scan using idx_phone_address on public.phone_numbers p (cost=0.06..4.06 rows=1 width=8) (actual time=0.032..0.054 rows=11 loops=1)
Output: p.contact_id
Index Cond: ((p."national" = 5038904993::bigint) AND ((p.e164)::text = '+15038904993'::text))
Buffers: shared hit=9
SubPlan 2
-> Bitmap Heap Scan on public.email_addresses e (cost=2.06..8.85 rows=5 width=8) (actual time=0.040..0.049 rows=5 loops=1)
Output: e.contact_id
Recheck Cond: ((e.email)::text = 'robert.xxxx@xxxxxxxxx'::text)
Buffers: shared hit=7
-> Bitmap Index Scan on idx_email_address (cost=0.00..2.06 rows=5 width=0) (actual time=0.031..0.031 rows=6 loops=1)
Index Cond: ((e.email)::text = 'robert.xxxx@xxxxxxxxx'::text)
Buffers: shared hit=2
Total runtime: 1.371 ms
(19 rows)
dft1fjfv106r48=>
On Fri, Nov 22, 2013 at 7:54 AM, Merlin Moncure <mmoncure@xxxxxxxxx> wrote:
On Thu, Nov 21, 2013 at 2:31 PM, desmodemone <desmodemone@xxxxxxxxx> wrote:
> Hi Robert, could you try with "exists" ?
>
> SELECT c.*
> FROM contacts c
> WHERE exists ( SELECT 1 FROM phone p WHERE p.addr =? and
> p.contact_id=c.id )
> OR exists (SELECT 1 FROM email e WHERE e.addr = ? and e.contact_id=c.id );
hm, how about:
SELECT c.*
FROM contacts c
WHERE exists (
SELECT 1
FROM phone p
WHERE p.addr =? AND p.contact_id=c.id
UNION ALL
SELECT 1 FROM email e
WHERE e.addr = ? AND e.contact_id=c.id
);
merlin