Re: Efficiency of EXISTS?

Merlin Moncure <mmoncure@xxxxxxxxx> · Mon, 23 Jul 2012 16:52:33 -0500

On Mon, Jul 23, 2012 at 4:12 PM, Kenneth Tilton <ktilton@xxxxxxxx> wrote:
> My mental model of the EXISTS clause must be off. This snippet appears at
> the end of a series of WITH clauses I suspect are irrelevant:
>
>> with etc etc ... , cids as
>>   (select distinct c.id from ddr2 c
>> join claim_entries ce on ce.claim_id = c.id
>> where (c.assigned_ddr = 879
>> or exists (select 1 from ddr_cdt dc
>> where
>> dc.sys_user_id = 879
>> and dc.document_type = c.document_type
>> -- makes it faster: and (dc.cdt_code is null or dc.cdt_code = ce.cpt_code)
>> )))
>>
>> select count(*) from cids
>
>
> If I uncomment the bit where it says "make it faster" I get decent response
> and the graphical analyze display shows the expected user+doctype+cdtcode
> index is being used (and nice thin lines suggesting efficient lookup).
>
> As it is, the analyze display shows the expected user+doctype index* being
> used but the lines are fat, and performance is an exponential disaster.
>
> * I created the (to me ) redundant user+doctype index trying to get Postgres
> to Do the Right Thing(tm), but I can see that was not the issue.
>
> I presume the reason performance drops off a cliff is because there can be
> 9000 cdt_codes for one user+doctype, but I was hoping EXISTS would just look
> to see if there was at least one row matching user+doctype and return its
> decision. I have tried select *, select 1, and limit 1 on the nested select
> to no avail.
>
> Am I just doing something wrong? I am a relative noob. Is there some other
> hint I can give the planner?

hard to say without having the explain analyze output.  also it's not
clear why you need to use WITH, at least for the terminating query.
I'd just do:

select count(*) from
(
  inner_query
)

merlin

-- 
Sent via pgsql-performance mailing list (pgsql-performance@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance