* Tom Lane: > Yeah, n_distinct estimation from a sample is inherently hard :-(. Given > that you have such a long tail on the distribution, it might be worth > your while to crank the stats target for that column all the way to the > maximum (1000). I've done that. Fortunately, ANALYZE time didn't increase by that much, compared to the default (by just a factor of 10). The bitmap scan estimate is still way off (around 8000), but let's hope that it won't matter in practice. > Also you need to experiment with extending the stats for the smaller > table. Yeah, the situation is quite similar, but on a much smaller scale. > I believe what's happening here is that the smaller table joins only to > less-frequent entries in the big table (correct?). Almost. We won't select the rows based on these values, at least not in queries of that type. The reason is simply that the result set is too large to be useful. > Of course, large stats targets will slow down planning to some extent, > so you should also keep an eye on how long it takes to plan the query. These queries are mostly ad-hoc, so a delay of a couple of seconds doesn't matter. Only if you need to wait five minutes, it's a different story. It seems that the situation is under control now. Thanks. -- Florian Weimer <fweimer@xxxxxx> BFK edv-consulting GmbH http://www.bfk.de/ Durlacher Allee 47 tel: +49-721-96201-1 D-76131 Karlsruhe fax: +49-721-96201-99