OK. Well, fortunately for us, we have a lot of possible solutions to this problem, and it sounds like actually getting statistics for attributes ? 'reference' is not realistic. I just wanted to make sure it wasn't some configuration error on our part. Can anyone explain where exactly the estimate for that clause comes from?

I tried adding an index, and I don't think it improved the estimation: the planner still thinks there will be ~9k rows as a result of type_id = 23 AND attributes ? 'reference' [1]. It might make the pathological plan less likely, though it's not clear to me that it reduces the risk of a pathological plan to zero.

I also tried wrapping it in a subquery [2]. The estimate is, of course, still awful, but it doesn't matter anymore because the planner can't pick a plan that leverages its low estimate. Its only choice is a simple filter on the results.

[1]
# CREATE INDEX foobarbaz ON component((attributes -> 'reference')) WHERE ( attributes ? 'reference' );
CREATE INDEX

judicata=# explain (analyze, buffers) declare "foo_cursor" cursor for
SELECT ref.case_id, array_agg(ref.attributes -> 'reference')
FROM component ref
JOIN document c ON c.id = ref.case_id
WHERE ref.type_id = 23
  AND ref.attributes ? 'reference'
  AND NOT 0 = ANY(c.types)
GROUP BY ref.case_id;

                                                                 QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------------------
 HashAggregate  (cost=419667.86..419667.99 rows=10 width=34) (actual time=97074.773..97197.487 rows=90969 loops=1)
   Buffers: shared hit=16954389 read=4533956 dirtied=2963 written=4759
   ->  Nested Loop  (cost=5472.44..419628.76 rows=5213 width=34) (actual time=537.202..94710.844 rows=2488142 loops=1)
         Buffers: shared hit=16954389 read=4533956 dirtied=2963 written=4759
         ->  Bitmap Heap Scan on component ref  (cost=5468.01..342716.88 rows=9087 width=34) (actual time=534.862..49617.945 rows=4624280 loops=1)
               Recheck Cond: (attributes ? 'reference'::text)
               Rows Removed by Index Recheck: 28739170
               Filter: (type_id = 23)
               Rows Removed by Filter: 165268
               Buffers: shared hit=25 read=921758 dirtied=2963 written=906
               ->  Bitmap Index Scan on foobarbaz  (cost=0.00..5465.74 rows=98636 width=0) (actual time=532.215..532.215 rows=4789548 loops=1)
                     Buffers: shared read=59300 written=57
         ->  Bitmap Heap Scan on document c  (cost=4.43..8.45 rows=1 width=4) (actual time=0.009..0.009 rows=1 loops=4624280)
               Recheck Cond: (id = ref.case_id)
               Filter: (0 <> ALL (types))
               Rows Removed by Filter: 0
               Buffers: shared hit=16954364 read=3612198 written=3853
               ->  Bitmap Index Scan on document_pkey  (cost=0.00..4.43 rows=1 width=0) (actual time=0.003..0.003 rows=1 loops=4624280)
                     Index Cond: (id = ref.case_id)
                     Buffers: shared hit=14082540 read=1859742 written=1974
 Total runtime: 97217.718 ms

[2]
# explain (analyze, buffers) declare "foo_cursor" cursor for
SELECT *
FROM (SELECT ref.case_id AS case_id, array_agg(ref.attributes -> 'reference') AS reference
      FROM component ref
      JOIN document c ON c.id = ref.case_id
      WHERE ref.type_id = 23
        AND NOT 0 = ANY(c.types)
      GROUP BY ref.case_id) AS t
WHERE reference IS NOT NULL;

                                                                 QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------------------------
 GroupAggregate  (cost=5636347.52..12524155.45 rows=9817 width=34) (actual time=165466.502..195035.433 rows=93580 loops=1)
   Filter: (array_agg((ref.attributes -> 'reference'::text)) IS NOT NULL)
   Buffers: shared hit=13884 read=2085572 written=2952, temp read=902337 written=902337
   ->  Merge Join  (cost=5636347.52..12458841.11 rows=5213367 width=34) (actual time=165383.814..193813.490 rows=5115136 loops=1)
         Merge Cond: (c.id = ref.case_id)
         Buffers: shared hit=13884 read=2085572 written=2952, temp read=902337 written=902337
         ->  Index Scan using document_pkey on document c  (cost=0.43..6696889.20 rows=2128590 width=4) (actual time=0.009..24720.726 rows=94634 loops=1)
               Filter: (0 <> ALL (types))
               Rows Removed by Filter: 70829
               Buffers: shared hit=13852 read=195821
         ->  Materialize  (cost=5636345.76..5681782.42 rows=9087332 width=34) (actual time=165383.798..168027.149 rows=9120904 loops=1)
               Buffers: shared hit=32 read=1889751 written=2952, temp read=902337 written=902337
               ->  Sort  (cost=5636345.76..5659064.09 rows=9087332 width=34) (actual time=165383.793..167173.325 rows=9120904 loops=1)
                     Sort Key: ref.case_id
                     Sort Method: external merge  Disk: 1392648kB
                     Buffers: shared hit=32 read=1889751 written=2952, temp read=902337 written=902337
                     ->  Bitmap Heap Scan on component ref  (cost=481859.39..3592128.04 rows=9087332 width=34) (actual time=20950.899..145515.599 rows=9120904 loops=1)
                           Recheck Cond: (type_id = 23)
                           Rows Removed by Index Recheck: 57286889
                           Buffers: shared hit=32 read=1889751 written=2952
                           ->  Bitmap Index Scan on component_type_id  (cost=0.00..479587.56 rows=9087332 width=0) (actual time=20947.739..20947.739 rows=12143019 loops=1)
                                 Index Cond: (type_id = 23)
                                 Buffers: shared read=164918 written=2816
 Total runtime: 195213.232 ms

On Wed, Jun 10, 2015 at 1:01 PM, Merlin Moncure <mmoncure@xxxxxxxxx> wrote:
> On Wed, Jun 10, 2015 at 2:40 PM, Josh Berkus <josh@xxxxxxxxxxxx> wrote:
>> On 06/10/2015 11:32 AM, Merlin Moncure wrote:
>>> This is a fundamental issue with using 'database in a box' datatypes
>>> like hstore and jsonb.  They are opaque to the statistics gathering
>>> system and so are unable to give reasonable estimates beyond broad
>>> assumptions.
>>> Speaking generally, the workarounds are to:
>>>
>>> *) disable particular plan choices for this query
>>> (materialize/nestloop are common culprits)
>>>
>>> *) create btree indexes around specific extraction clauses
>>>
>>> *) refactor some of the query into a set returning function with a
>>> custom ROWS clause
>>>
>>> *) try an alternate indexing strategy such as jsonb/jsquery
>>>
>>> *) move out of hstore and into a more standard relational structure
>>
>> You forgot:
>>
>> *) Fund a PostgreSQL developer to add selectivity estimation and stats
>> to hstore.
>
> Well, I don't know.  That's really complex to the point of making me
> wonder if it's worth doing even given infinite time and resources.  If
> it was my money, I'd be researching a clean way to inject estimate
> returning expressions into the query that the planner could utilize.
> Not 'hints' which are really about managing the output of the planner,
> just what feeds in.  Also lots of various solutions of alcohol to
> lubricate the attendant -hackers discussions.
>
> merlin

--
Sent via pgsql-performance mailing list (pgsql-performance@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance
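To illustrate the "set returning function with a custom ROWS clause" workaround from the list above, here is a minimal sketch using the table and predicates from the query in this thread. The function name is made up, and the ROWS figure is a placeholder you would set from your own knowledge of the data (the actual plans above show roughly 4.6M matching rows); the planner will use that number instead of its default guess for the opaque hstore clause:

```sql
-- Sketch only: wrap the hard-to-estimate predicate in a SQL function so
-- the planner can be handed an explicit row-count estimate via ROWS.
CREATE OR REPLACE FUNCTION component_with_reference()
RETURNS SETOF component
LANGUAGE sql STABLE
ROWS 4600000  -- placeholder estimate supplied by hand, not computed
AS $$
  SELECT *
  FROM component
  WHERE type_id = 23
    AND attributes ? 'reference'
$$;

-- The original query can then join against the function instead of the
-- bare table, carrying the hand-supplied estimate into the join planning:
SELECT ref.case_id, array_agg(ref.attributes -> 'reference')
FROM component_with_reference() ref
JOIN document c ON c.id = ref.case_id
WHERE NOT 0 = ANY(c.types)
GROUP BY ref.case_id;
```

Note the trade-off: a function scan can't use indexes on component directly, so this mainly helps when the bad estimate (rather than the scan itself) is what pushes the planner into a pathological join strategy.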