Shaun Thomas <sthomas@xxxxxxxxxxxxxxxx> wrote: > So, we know we have a way of doing a loose index scan with CTEs: > > http://wiki.postgresql.org/wiki/Loose_indexscan I tried this on a table in production with 23 million rows for a column with 45 distinct values which is the high-order column of a four-column index. This ran in 445 ms first time and 2 ms on the second and subsequent tries. The equivalent SELECT DISTINCT ran in 30 seconds first time, and got down to 11.5 seconds after a few runs. So roughly two orders of magnitude faster with a cold cache and three orders of magnitude faster with a warm cache. That sure would be a nice optimization to have in the planner. > But that got me wondering. The planner knows from pg_stats that > col1 could have low cardinality. If that's the case, and a WHERE > clause uses a two column index, and col2 is specified, why can't > it walk each individual bucket in the two-column index, and use > col2? So I forced such a beast with a CTE: > > WITH RECURSIVE t AS ( > SELECT min(col1) AS col1 > FROM tablename > UNION ALL > SELECT (SELECT min(col1) > FROM tablename > WHERE col1 > t.col1) > FROM t > WHERE t.col1 IS NOT NULL > ) > SELECT p.* > FROM t > JOIN tablename p USING (col1) > where p.col2 = 12345 > > I ask, because while the long-term fix would be to re-order the > index to (col2, col1), this seems like a situation the planner > could easily detect and compensate for. In our particular example, > execution time went from 160ms to 2ms with the CTE rewrite. Well, that'd be the icing on the cake. I'd be overjoyed to get the cake. :-) -Kevin -- Sent via pgsql-performance mailing list (pgsql-performance@xxxxxxxxxxxxxx) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance