On Wed, Jun 10, 2015 at 12:32 PM, Patrick Krecker <patrick@xxxxxxxxxxxx> wrote: > Hi everyone -- > > I had an issue the other day where a relatively simple query went from > taking about 1 minute to execute to taking 19 hours. It seems that the > planner chooses to use a materialize sometimes [1] and not other times > [2]. I think the issue is that the row count estimate for the result > of the condition "type_id = 23 and ref.attributes ? 'reference'" is > about 10k rows, but the actual result is 4624280. It seems the > estimate varies slightly over time, and if it drops low enough then > the planner decides to materialize the result of the bitmap heap scan > and the query takes forever. > > As an exercise, I tried removing the clause "ref.attributes ? > 'reference'" and the estimates are very accurate [3]. This is a fundamental issue with using 'database in a box' datatypes like hstore and jsonb. They are opaque to the statistics gathering system and so are unable to give reasonable estimates beyond broad assumptions. Speaking generally, the workarounds are too: *) disable particular plan choices for this query (materialize/nestloop are common culprits) *) create btree indexes around specific extraction clauses *) refactor some of the query into set returning function with a custom ROWS clause *) try alternate indexing strategy such as jsonb/jsquery *) move out of hstore and into more standard relational strucure none of the above may be ideal in your particular case. merlin -- Sent via pgsql-performance mailing list (pgsql-performance@xxxxxxxxxxxxxx) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance