> -----Original Message-----
> From: pgsql-performance-owner@xxxxxxxxxxxxxx
> [mailto:pgsql-performance-owner@xxxxxxxxxxxxxx] On Behalf Of Sam Wong
> Sent: Thursday, September 26, 2013 0:34
> To: 'postgres performance list'
> Subject: Re: Slow plan for MAX/MIN or LIMIT 1?
>
> > -----Original Message-----
> > From: pgsql-performance-owner@xxxxxxxxxxxxxx
> > [mailto:pgsql-performance-owner@xxxxxxxxxxxxxx] On Behalf Of Merlin Moncure
> > Sent: Wednesday, September 25, 2013 23:55
> > To: Claudio Freire
> > Cc: Sam Wong; postgres performance list
> > Subject: Re: Slow plan for MAX/MIN or LIMIT 1?
> >
> > On Wed, Sep 25, 2013 at 10:20 AM, Claudio Freire <klaussfreire@xxxxxxxxx> wrote:
> > > On Wed, Sep 25, 2013 at 10:29 AM, Merlin Moncure <mmoncure@xxxxxxxxx> wrote:
> > >> On Tue, Sep 24, 2013 at 4:56 PM, Claudio Freire <klaussfreire@xxxxxxxxx> wrote:
> > >>> On Tue, Sep 24, 2013 at 6:24 AM, Sam Wong <sam@xxxxxxxxxxxx> wrote:
> > >>>> This event_log table has 4 million rows.
> > >>>>
> > >>>> "log_id" is the primary key (bigint),
> > >>>>
> > >>>> there is a composite index "event_data_search" over (event::text,
> > >>>> insert_time::datetime).
> > >>>
> > >>> I think you need to add log_id to that composite index to get pg
> > >>> to use it.
> > >>
> > >> hurk: OP has two statistics misses (one of them massive) that are
> > >> combining to gobsmack you.
> > >>
> > >> your solution unfortunately won't work: you can't combine two range
> > >> searches in a single index scan.  it would probably work if you did it
> > >> like this: if insert_time is a timestamp, not a timestamptz, we can
> > >> convert it to a date to get what I think he wants (as long as his
> > >> queries are along date boundaries).
> > >
> > > I was thinking of an index over:
> > >
> > >   (event, date_trunc('day', insert_time), log_id)
> > >
> > > and a query like:
> > >
> > >   SELECT min(log_id) FROM event_log
> > >   WHERE event = 'S-Create' AND
> > >     date_trunc('day', insert_time) = '2013-09-15';
> > >
> > > That's a regular simple range scan over the index.
> >
> > *) date_trunc has the same problem as ::date: it is only a stable
> > expression for timestamptz.  also, the index will be bigger, since
> > you're still indexing the timestamp
> >
> > *) a row-wise comparison search might be faster, and it generalizes to
> > returning N records, not just one.
> >
> > merlin
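For concreteness, the row-wise comparison search mentioned above might look
something like the sketch below. It is only a sketch: the index name,
constants, and LIMIT are illustrative assumptions, and it presumes a plain
timestamp column queried along date boundaries. Both row-constructor
comparisons become the bounds of a single btree index range scan, and
LIMIT N returns the first N matching rows in index order rather than just
the minimum:

  -- assumed index; any btree over (event, insert_time, log_id) works
  CREATE INDEX event_search_rowwise
      ON event_log (event, insert_time, log_id);

  -- first 10 'S-Create' rows on 2013-09-15, in index order:
  -- the two row comparisons pin the start and end of one index scan
  SELECT log_id, event, insert_time
  FROM event_log
  WHERE (event, insert_time) >= ('S-Create', TIMESTAMP '2013-09-15')
    AND (event, insert_time) <  ('S-Create', TIMESTAMP '2013-09-16')
  ORDER BY event, insert_time, log_id
  LIMIT 10;

Because the ORDER BY matches the index order, no sort step is needed.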
> I'm afraid it's not anything to do with the composite index. (Query B works
> fine and its plan is as expected.) BTW, the timestamp is without time zone.
>
> I just thought of another way to demonstrate the issue.
> Both queries return the same one-row result. log_id is the bigint primary
> key, and event_data_search is still indexed as above.
> ---
> Fast query
> ---

Excuse me, this is the actual query:

  WITH q AS (
    SELECT event FROM event_log
    WHERE log_id > 1010000 AND log_id < 1050000
    ORDER BY event
  )
  SELECT * FROM q LIMIT 1;

> Limit (cost=2521.82..2521.83 rows=1 width=32) (actual time=88.342..88.342 rows=1 loops=1)
>   Output: q.event
>   Buffers: shared hit=93 read=622
>   CTE q
>     -> Sort (cost=2502.07..2521.82 rows=39502 width=6) (actual time=88.335..88.335 rows=1 loops=1)
>          Output: event_log.event
>          Sort Key: event_log.event
>          Sort Method: quicksort  Memory: 3486kB
>          Buffers: shared hit=93 read=622
>          -> Index Scan using event_log_pkey on uco.event_log (cost=0.00..1898.89 rows=39502 width=6) (actual time=13.918..65.573 rows=39999 loops=1)
>               Output: event_log.event
>               Index Cond: ((event_log.log_id > 1010000) AND (event_log.log_id < 1050000))
>               Buffers: shared hit=93 read=622
>   -> CTE Scan on q (cost=0.00..237.01 rows=39502 width=32) (actual time=88.340..88.340 rows=1 loops=1)
>        Output: q.event
>        Buffers: shared hit=93 read=622
> Total runtime: 89.039 ms
>
> ---
> Slow query
> ---

And the query for this... I must have forgotten to paste it:

  SELECT min(event) FROM event_log
  WHERE log_id > 1010000 AND log_id < 1050000;

> Result (cost=1241.05..1241.05 rows=1 width=0) (actual time=1099.532..1099.533 rows=1 loops=1)
>   Output: $0
>   Buffers: shared hit=49029 read=57866
>   InitPlan 1 (returns $0)
>     -> Limit (cost=0.00..1241.05 rows=1 width=6) (actual time=1099.527..1099.527 rows=1 loops=1)
>          Output: uco.event_log.event
>          Buffers: shared hit=49029 read=57866
>          -> Index Scan using event_data_search on uco.event_log (cost=0.00..49024009.79 rows=39502 width=6) (actual time=1099.523..1099.523 rows=1 loops=1)
>               Output: uco.event_log.event
>               Index Cond: (uco.event_log.event IS NOT NULL)
>               Filter: ((uco.event_log.log_id > 1010000) AND (uco.event_log.log_id < 1050000))
>               Rows Removed by Filter: 303884
>               Buffers: shared hit=49029 read=57866
> Total runtime: 1099.568 ms
>
> (Note: things got buffered, so it goes down to about 1 second, but judging
> by the buffer counts compared with the query above, this is a few orders of
> magnitude slower.)
>
> ---
> The CTE "fast query" works: it completes with an index scan over
> "event_log_pkey", which is what I am expecting, and it is good.
> But it is much more straightforward to write it as the "slow query", and I
> would expect the planner to give the same optimal plan for both.
>
> The "slow query" plan has an estimated total cost of 1241.05, and the
> "fast query" plan has 2521.83, hence the planner prefers the former - the
> scan over the "event_data_search" index - but this choice doesn't make
> sense to me.
>
> This is the point I want to bring up: why does the planner do that,
> instead of scanning over "event_log_pkey"?
>
> Looking at the estimated cost of the slow query plan, the Index Scan
> node's lower bound (startup cost) is estimated at 0.00. IIRC, that means
> the planner estimated the first result row could be returned at cost 0.00.
> But in fact it has to scan almost the whole index, checking each row to
> see whether its log_id falls within the condition range, so there is no
> way the first row could ever be returned at 0.00.
> (event is a text column with a cardinality of around 20, and its values
> appear randomly all over the table - 0 correlation.) This questionable
> estimate bubbles up the tree: the Limit node thought it could finish at
> 1241.05 (once the first row is obtained), and hence the whole plan was
> thought to be finishable within 1241.05 - which is not the case.
>
> Sam
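As a footnote to the above: one way to check the cardinality and
correlation figures Sam quotes is the pg_stats view. A minimal sketch,
assuming the schema, table, and column names shown in the plans above:

  SELECT n_distinct,   -- expected: roughly 20 distinct event values
         correlation   -- expected: near 0, values scattered over the table
  FROM pg_stats
  WHERE schemaname = 'uco'
    AND tablename = 'event_log'
    AND attname = 'event';

The CTE form dodges the misestimate because, in PostgreSQL releases of this
era, WITH subqueries are always materialized and act as an optimization
fence: the ORDER BY ... LIMIT 1 cannot be rewritten into the min()-style
first-row probe of event_data_search, so the pkey range scan plus sort is
planned on its own merits.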
--
Sent via pgsql-performance mailing list (pgsql-performance@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance