After bumping work_mem up from 12MB to 25MB, that last subplan is indeed hashed now instead of materialized, and this cut the query time by about 60%. Thanks, this was very helpful and gives me something else to look for when troubleshooting explain plans.
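The thread doesn't say at what level the setting was changed; a bump like this can be applied per session for testing, per role, or cluster-wide. A rough sketch (the role name below is just a placeholder):

    -- per session, for testing
    SET work_mem = '25MB';

    -- per role, so only the application user gets the larger value
    ALTER ROLE app_user SET work_mem = '25MB';

    -- cluster-wide (ALTER SYSTEM is 9.4+), then reload the config
    ALTER SYSTEM SET work_mem = '25MB';
    SELECT pg_reload_conf();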
On Tue, Jul 7, 2015 at 11:10 AM, Mike Broers <mbroers@xxxxxxxxx> wrote:
Thanks, very informative! I'll experiment with work_mem settings and report back.

On Tue, Jul 7, 2015 at 11:02 AM, Tom Lane <tgl@xxxxxxxxxxxxx> wrote:

Mike Broers <mbroers@xxxxxxxxx> writes:
> I had a query that was filtering with a wildcard search of a text field for
> %SUCCESS%. The query took about 5 seconds and was running often so I wanted
> to improve it. I suggested that the engineers include a new boolean column
> for successful status. They implemented the requested field, but the query
> that filters on that new column runs very long (I killed it after letting it
> run for about an hour). Can someone help me understand why that is the
> case and how to resolve it?
It's hashing the subplan output in the first case and not the second:
> Seq Scan on lead (cost=130951.81..158059.21 rows=139957 width=369) (actual
> time=4699.619..4699.869 rows=1 loops=1)
> Filter: ((NOT (hashed SubPlan 1)) AND (("ReferenceNumber")::text <>
> ''::text) AND ((NOT (hashed SubPlan 2)) OR (NOT (hashed SubPlan 3))))
^^^^^^^^^^^^^^^^
vs
> Seq Scan on lead (cost=85775.78..9005687281.12 rows=139957 width=369)
> Filter: ((NOT (hashed SubPlan 1)) AND (("ReferenceNumber")::text <>
> ''::text) AND ((NOT (hashed SubPlan 2)) OR (NOT (SubPlan 3))))
^^^^^^^^^
Presumably, the new more-accurate rows count causes the planner to realize
that the hash table will exceed work_mem so it doesn't choose to hash ...
but for your situation, you'd rather it did, because what you're getting
instead is a Materialize node that spills to disk (again, because the data
involved exceeds work_mem) and that's a killer for this query. You should
be able to get back the old behavior if you raise work_mem enough.
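For experimenting, work_mem can be raised just for one transaction before re-running the explain; the value below is only illustrative, and the thing to look for is the subplan showing up as "hashed SubPlan" again with the Materialize gone:

    BEGIN;
    SET LOCAL work_mem = '64MB';   -- illustrative value, scoped to this transaction
    EXPLAIN (ANALYZE, BUFFERS) <the problem query here>;
    ROLLBACK;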
Another idea you might think about is changing the OR'd IN conditions
to a single IN over a UNION ALL of the subselects. I'm not really sure if
that would produce a better plan, but it's worth trying if it wouldn't
require too much app-side contortion.
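The original query isn't quoted in the thread, so purely as a sketch with made-up column and subselect names, that rewrite has this shape:

    -- before: two IN conditions OR'd together, each becoming its own subplan
    SELECT ...
    FROM lead
    WHERE lead.email IN (SELECT email FROM web_events WHERE status = 'SUCCESS')
       OR lead.email IN (SELECT email FROM import_events WHERE status = 'SUCCESS');

    -- after: a single IN over a UNION ALL of the subselects
    SELECT ...
    FROM lead
    WHERE lead.email IN (SELECT email FROM web_events WHERE status = 'SUCCESS'
                         UNION ALL
                         SELECT email FROM import_events WHERE status = 'SUCCESS');

If the real conditions are NOT IN's joined by OR, as the filter in the plan suggests, the boolean logic needs a second look before collapsing them, since NOT IN (a) OR NOT IN (b) is not the same thing as a single NOT IN over the combined subselect.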
regards, tom lane