Thanks a lot for your detailed explanation. I will try ASAP with no DISTINCT ( we are quite sure it is not needed anyway ), and report back here.
Michaël2017-08-21 23:52 GMT+02:00 David Rowley <david.rowley@xxxxxxxxxxxxxxx>:
On 19 August 2017 at 04:46, kimaidou <kimaidou@xxxxxxxxx> wrote:
> When we call the WHERE on the view:
>
> EXPLAIN (ANALYZE, COSTS, VERBOSE, BUFFERS)
> SELECT *
> FROM "qgep"."vw_qgep_reach"
> WHERE "progression_geometry" &&
> st_makeenvelope(1728327.03249295568093657,8240789. This is by design due to the DISTINCT ON() clause. Only quals which26074041239917278,1728608. 10987572139129043,8240958. 16933418624103069,3949)
>
>
> The query plan is "wrong", as PostgreSQL seems to consider it should do a
> seq scan on the tables, and only afterwards filter with the WHERE:
> https://explain.depesz.com/s/wXV
>
> The query takes about 1 second instead of less than 100ms.
>
> Do you have any hint on this kind of issue ?
filter columns which are in the DISTINCT ON can be safely pushed down.
Consider the following, where I've manually pushed the WHERE clause.
postgres=# create table tt (a int, b int);
CREATE TABLE
postgres=# create index on tt (a);
CREATE INDEX
postgres=# insert into tt values(1,1),(1,2),(2,1),(2,2);
INSERT 0 4
postgres=# select * from (select distinct on (a) a,b from tt order by
a,b) tt where b = 2;
a | b
---+---
(0 rows)
postgres=# select * from (select distinct on (a) a,b from tt where b =
2 order by a,b) tt;
a | b
---+---
1 | 2
2 | 2
(2 rows)
Note the results are not the same.
If I'd done WHERE a = 2, then the planner would have pushed the qual
down into the subquery.
More reading in check_output_expressions() in allpaths.c:
/* If subquery uses DISTINCT ON, check point 3 */
if (subquery->hasDistinctOn &&
!targetIsInSortList(tle, InvalidOid, subquery->distinctClause))
{
/* non-DISTINCT column, so mark it unsafe */
safetyInfo->unsafeColumns[tle->resno] = true;
continue;
}
The comment for point 3 reads:
* 3. If the subquery uses DISTINCT ON, we must not push down any quals that
* refer to non-DISTINCT output columns, because that could change the set
* of rows returned. (This condition is vacuous for DISTINCT, because then
* there are no non-DISTINCT output columns, so we needn't check. Note that
* subquery_is_pushdown_safe already reported that we can't use volatile
* quals if there's DISTINCT or DISTINCT ON.)
--
David Rowley http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services