Claudio Freire <klaussfreire@xxxxxxxxx> writes: > On Thu, Apr 26, 2012 at 2:37 PM, Claudio Freire <klaussfreire@xxxxxxxxx> wrote: >> Fun thing is, nothing in the CTE's execution really changed. The only >> change, is that now a sequential scan of overview was chosen instead >> of the index. >> Why could this be? The output (number of search values, even the >> values themselves and their order) is the same between both plans. The estimated size of the UNION output is a lot different, thus discouraging use of a nestloop for the outer query's join. > I just noticed it's misestimating the output of the union distinct, > but not of the select distinct. > One would expect the estimation procedure to be the same in both cases. No, I don't think that follows. The UNION code is aware that people frequently write UNION rather than UNION ALL even when they're not expecting any duplicates, so it shies away from assuming that UNION will reduce the number of rows at all. On the other hand, it's not at all common to write SELECT DISTINCT unless you actually expect some duplicate removal to happen, so the default assumption in the absence of any stats is different --- looks like it's assuming 10X compression by the DISTINCT operation. The real issue here of course is that we have no useful idea how many rows will be produced by the recursive-union CTE, let alone what their statistics will be like. So the planner is falling back on rules of thumb that are pretty much guaranteed to be wrong in any particular case. regards, tom lane -- Sent via pgsql-performance mailing list (pgsql-performance@xxxxxxxxxxxxxx) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance