Tom Lane <tgl@xxxxxxxxxxxxx> wrote: > The point here is that you've repeated the same example N times > without actually making a case that it's interesting to support. We > have to think about the intellectual complexity that would be added > to the planner to support this case, and the cycles that would be > expended on every query (and wasted, for most queries) on trying to > detect whether the case applies. If it were simple and cheap to do, > these arguments wouldn't hold much weight, but it doesn't look to > me like either is the case. > > Another problem is that it's not clear there's much to be gained. > Avoiding the sort step is only interesting if the query produces so > many rows that a sort would be expensive ... but if that's the case, it > seems unlikely that a nestloop indexscan plan would be the best > choice anyway. > > So basically this looks like a lot of work for a narrow and questionable > gain. If you want it to happen you need to convince people that it's > easier and more useful than it looks. > > regards, tom lane Well, I probably won't tell you that it's easy to do, because you know the planner far far better than I do -- I'm just a user who knows almost nothing about what's happening inside it. I just thought that if the planner is examining a plan, where the outer scan is using a unique index with all it's columns AND doing a nestloop join with the inner scan returning the rows in some order, then the planner could say "this is already ordered by (outer.unique, inner.some_order)" and wouldn't make a sort at the end. Of course this is far from perfect, because what about three/four/more table joins... maybe that can be derived from this, I don't really know now. Another situation that could be optimized this way is if I write "ORDER BY outer.idxcol1, outer.idxcol2, outer.id, inner.something" where + there is a non-unique index on (outer.idxcol1, outer.idxcol2), + outer.id is a PRI KEY, + there is an index on (inner.outer_id, inner.something) too but that's getting really complicated. Of course if the simpler case would be working, then I could create a unique index on (outer.idxcol1, outer.idxcol2, outer.id) and this would run optimized too. I think what's common is: + outer scan produces uniquely sorted rows + nested loop + inner scan produces sorted rows + ORDER BY says outer.unique_sort, inner.sort If this is the situation, the sort could be done separately. Even like: -> Nested Loop -> Sort by outer.unique -> Scan producing the required rows from outer -> Sort by inner.something -> Scan producing the required rows from inner Filter or Index Cond for the join And this is good for any query that needs data from two tables, but wants to get the joined rows so that all rows for a single outer-row are "packed together" (I mean they do not mix). Now about if it's worth it. I tried a query on real data, with lots of rows. Tables involved: - threads: ~200K rows - msgs: ~8M rows I wanted to see the last msgs from the last threads, but did not want to mix them. It's like PgSQL's mailing archive viewed by threads. I wanted them paged, not all 8M msgs at once (LIMIT/OFFSET). Query is: SELECT * FROM threads AS thr JOIN msgs AS msg ON msg.thrid = thr.id WHERE thr.forid = 1 ORDER BY thr.id DESC, msg.msgid DESC LIMIT 100 thr.forid is a FKEY to forums. Say forum 1 is the pgsql-performance list. Table msgs has only a PRI KEY: (thrid, msgid). Table threads has a PRI KEY: (id) and a unique index: (forid, id). First time I ran the query with seqscan, bitmapscan, hashjoin, mergejoin disabled (just to get the nested loop plan with the needless sort). Second time I ran it without disabling anything, but I modified ORDER BY to only "thr.id DESC" (deleted "msg.msgid DESC"), to give me the same plan as before, but without the sort. PSQL input and output attached. I think the 5000x speed would be worth it. Regards, Denes Daniel --------------------------------- Játssz a megújult Kvízparton! Válassz több mint 400 kvíz közül minden témában! ________________________________________________________ http://cthandler.adverticum.net/?cturl=http%3A%2F%2Fkvizpart.hu%2F?fmalja
-- =========== Test #1 =========== -- BEGIN; SET enable_seqscan TO false; SET enable_bitmapscan TO false; SET enable_hashjoin TO false; SET enable_mergejoin TO false; EXPLAIN ANALYZE SELECT * FROM threads AS thr JOIN msgs AS msg ON msg.thrid = thr.id WHERE thr.forid = 1 ORDER BY thr.id DESC, msg.msgid DESC LIMIT 100; ROLLBACK; QUERY PLAN ------------------------------------------------------------------------------------------------------------------------------------------------------------------- Limit (cost=64537577.92..64537677.92 rows=100 width=447) (actual time=61923.813..61924.379 rows=100 loops=1) -> Sort (cost=64537577.92..64633839.92 rows=96262 width=447) (actual time=61923.804..61924.038 rows=100 loops=1) Sort Key: thr.id, msg.msgid -> Nested Loop (cost=0.00..61339121.37 rows=96262 width=447) (actual time=0.248..40366.113 rows=431350 loops=1) -> Index Scan using threads_uni_forid_id on threads thr (cost=0.00..14229.22 rows=2850 width=176) (actual time=0.127..5246.470 rows=8339 loops=1) Index Cond: (forid = 1) -> Index Scan using msgs_pkey on msgs msg (cost=0.00..12595.51 rows=2974 width=271) (actual time=1.544..4.000 rows=52 loops=8339) Index Cond: (msg.thrid = "outer".id) Total runtime: 62079.452 ms (9 rows) -- =========== Test #2 =========== -- EXPLAIN ANALYZE SELECT * FROM threads AS thr JOIN msgs AS msg ON msg.thrid = thr.id WHERE thr.forid = 1 ORDER BY thr.id DESC LIMIT 100; QUERY PLAN ---------------------------------------------------------------------------------------------------------------------------------------------------------------- Limit (cost=0.00..63721.01 rows=100 width=447) (actual time=9.579..10.775 rows=100 loops=1) -> Nested Loop (cost=0.00..61339121.37 rows=96262 width=447) (actual time=9.574..10.438 rows=100 loops=1) -> Index Scan Backward using threads_uni_forid_id on threads thr (cost=0.00..14229.22 rows=2850 width=176) (actual time=0.082..0.145 rows=4 loops=1) Index Cond: (forid = 1) -> Index Scan using msgs_pkey on msgs msg (cost=0.00..12595.51 rows=2974 width=271) (actual time=2.399..2.475 rows=25 loops=4) Index Cond: (msg.thrid = "outer".id) Total runtime: 11.071 ms (7 rows)
---------------------------(end of broadcast)--------------------------- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq