On 1/22/08, Richard Huxton <dev@xxxxxxxxxxxx> wrote:
> Although the row-estimate still seems quite high. You might want to
> increase it even further (maximum is 1000). If this is a common query,
> I'd look at an index on (user,id) rather than just (user) perhaps.

Actually that index (with the same statistics setting as before) yields
slightly worse performance:

# explain analyze select max(id) from user_messages where user_id = 13604;
                                                                             QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Result  (cost=3.86..3.87 rows=1 width=0) (actual time=0.051..0.052 rows=1 loops=1)
   InitPlan
     ->  Limit  (cost=0.00..3.86 rows=1 width=4) (actual time=0.045..0.045 rows=0 loops=1)
           ->  Index Scan Backward using user_messages_user_id_id on user_messages  (cost=0.00..1486.79 rows=385 width=4) (actual time=0.042..0.042 rows=0 loops=1)
                 Index Cond: (user_id = 13604)
                 Filter: (id IS NOT NULL)
 Total runtime: 0.128 ms

Compare with the plain index on the one attribute:

# explain analyze select max(id) from user_messages where user_id = 13604;
                                                                  QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------------------------
 Aggregate  (cost=1388.34..1388.35 rows=1 width=4) (actual time=0.034..0.035 rows=1 loops=1)
   ->  Index Scan using user_messages_user on user_messages  (cost=0.00..1387.40 rows=374 width=4) (actual time=0.030..0.030 rows=0 loops=1)
         Index Cond: (user_id = 13604)
 Total runtime: 0.085 ms

> > But it's worrying that PostgreSQL should be so off in planning the
> > query. Does this behaviour qualify as a bug, or is this -- that is,
> > the need to tweak statistics parameters -- just your garden-variety
> > application-specific optimization?
>
> Well, it's data-specific rather than application specific I suppose.
> The issue is that there is a cost to tracking 100 values and you don't
> want to pay that on every column in every table. If user 13604 isn't in
> the list of most-common users then all it can really do is fix an upper
> bound on how many matches it can have. Of course you and I can reason
> outside of the data and guess that many users won't have more than a
> handful of messages, but that's not something PG can do.

Absolutely. Thanks for the pointers.

Alexander.
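For anyone following along, the two knobs discussed in this thread can be reproduced roughly as below. This is a sketch, not the exact commands from the thread: the table and column names are taken from the plans above, the index name matches the one the planner reports, and 1000 is the statistics-target maximum Richard mentions (on the PostgreSQL versions of that era).

```sql
-- Raise the per-column statistics target so ANALYZE tracks more
-- most-common values and histogram buckets for user_id.
ALTER TABLE user_messages ALTER COLUMN user_id SET STATISTICS 1000;
ANALYZE user_messages;

-- Composite index so max(id) for a single user can be answered by a
-- backward index scan on (user_id, id), as in the first plan above.
CREATE INDEX user_messages_user_id_id ON user_messages (user_id, id);

-- Re-check the plan and row estimate afterwards.
EXPLAIN ANALYZE SELECT max(id) FROM user_messages WHERE user_id = 13604;
```

As the plans above show, better statistics shrink the row estimate, but for a user with no rows at all either index answers the query in well under a millisecond.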