Re: Slow query: select * order by XXX desc offset 10 limit 10

Alexander Farber <alexander.farber@xxxxxxxxx> · Fri, 14 Oct 2011 13:58:46 +0200

Hi Alban and others -

On Fri, Oct 14, 2011 at 1:34 PM, Alban Hertroys <haramrae@xxxxxxxxx> wrote:
> Anyway, I think you get the sequential scans because the UNION requires to sort all the data from both tables to guarantee that the results are unique (hence that long Sort Key at the 7th line of explain output).
> For that reason, an index on qdatetime alone won't help much, especially when most of your data has qdatetime <= now(), which is probably the case.
>
> It doesn't matter that you only want 10 results from that set, the database will first have to figure out which those rows are. That gets more complicated because they can come from two different tables, due to the UNION.
>
> Do you really need unique results from that view, or are duplicates acceptable (one from each table)? In that case, try UNION ALL instead of UNION.

I don't need unique at all!

So I've run "explain analyse" on the old view:

quincy=> explain analyse select to_char(qdatetime, 'YYYY-MM-DD') as
QDATETIME,ID,NAME,CATEGORY,APPSVERSION,OSVERSION,DETAILS,DEVINFO from
quincyview where qdatetime <= now() order by QDATETIME desc offset 10
limit 10;

      QUERY PLAN

----------------------------------------------------------------------------------------------------------------------
----------------------------------------------------------------------------------------------------------------------
----------------------------------------------------------------------------------------------------------------------
----------------------------------------------
 Limit  (cost=559200.14..559200.16 rows=10 width=1172) (actual
time=11311.537..11311.541 rows=10 loops=1)
   ->  Sort  (cost=559200.11..561534.85 rows=933894 width=1172)
(actual time=11311.532..11311.536 rows=20 loops=1)
         Sort Key: (to_char(quincyview.qdatetime, 'YYYY-MM-DD'::text))
         Sort Method:  top-N heapsort  Memory: 27kB
         ->  Subquery Scan quincyview  (cost=482985.36..534349.53
rows=933894 width=1172) (actual time=5778.592..9004.
663 rows=934084 loops=1)
               ->  Unique  (cost=482985.36..522675.85 rows=933894
width=254) (actual time=5777.972..7320.816 rows=9340
84 loops=1)
                     ->  Sort  (cost=482985.36..485320.09 rows=933894
width=254) (actual time=5777.969..6557.012 rows=
934084 loops=1)
                           Sort Key: quincynoreset.qdatetime,
quincynoreset.id, quincynoreset.name, quincynoreset.cate
gory, quincynoreset.appsversion, quincynoreset.osversion,
quincynoreset.beta_prog, quincynoreset.catinfo, quincynorese
t.details, quincynoreset.devinfo, quincynoreset.email,
quincynoreset.emailid, quincynoreset.imei, quincynoreset.pin, q
uincynoreset.formfactor, quincynoreset.copied
                           Sort Method:  external merge  Disk: 180992kB
                           ->  Append  (cost=0.00..55205.73
rows=933894 width=254) (actual time=11.592..2242.501 rows=
934084 loops=1)
                                 ->  Seq Scan on quincynoreset
(cost=0.00..39188.71 rows=808567 width=244) (actual ti
me=11.591..1739.695 rows=808647 loops=1)
                                       Filter: (qdatetime <= now())
                                 ->  Seq Scan on quincytrack
(cost=0.00..6678.09 rows=125327 width=315) (actual time=
6.801..298.642 rows=125437 loops=1)
                                       Filter: (qdatetime <= now())
 Total runtime: 11363.393 ms
(15 rows)

Then I'v dropped and recreated the view with "union all":

quincy=> drop view quincyview ;
DROP VIEW
quincy=> create view quincyview as
quincy->                 select
quincy->                         qdatetime,
quincy->                         id,
quincy->                         name,
quincy->                         category,
quincy->                         appsversion,
quincy->                         osversion,
quincy->                         beta_prog,
quincy->                         catinfo,
quincy->                         details,
quincy->                         devinfo,
quincy->                         email,
quincy->                         emailid,
quincy->                         imei,
quincy->                         pin,
quincy->                         formfactor,
quincy->                         copied
quincy->                 from quincynoreset
quincy->         union all
quincy->                 select
quincy->                         qdatetime,
quincy->                         id,
quincy->                         name,
quincy->                         category,
quincy->                         appsversion,
quincy->                         osversion,
quincy->                         beta_prog,
quincy->                         catinfo,
quincy->                         details,
quincy->                         devinfo,
quincy->                         email,
quincy->                         emailid,
quincy->                         imei,
quincy->                         pin,
quincy->                         formfactor,
quincy->                         copied
quincy->                 from quincytrack
quincy-> ;
CREATE VIEW

Then "explain analyse" on the same select query again:

quincy=> explain analyse select to_char(qdatetime, 'YYYY-MM-DD') as
QDATETIME,ID,NAME,CATEGORY,APPSVERSION,OSVERSION,DETAILS,DEVINFO from
quincyview where qdatetime <= now() order by QDATETIME desc offset 10
limit 10;
                                                                   QUERY PLAN

----------------------------------------------------------------------------------------------------------------------
--------------------------
 Limit  (cost=73052.13..73052.16 rows=10 width=111) (actual
time=3782.645..3782.649 rows=10 loops=1)
   ->  Sort  (cost=73052.11..75386.84 rows=933894 width=111) (actual
time=3782.640..3782.643 rows=20 loops=1)
         Sort Key: (to_char(quincynoreset.qdatetime, 'YYYY-MM-DD'::text))
         Sort Method:  top-N heapsort  Memory: 27kB
         ->  Result  (cost=0.00..48201.53 rows=933894 width=111)
(actual time=0.039..2660.561 rows=934084 loops=1)
               ->  Append  (cost=0.00..45866.79 rows=933894 width=111)
(actual time=0.021..1239.916 rows=934084 loops=
1)
                     ->  Seq Scan on quincynoreset
(cost=0.00..39188.71 rows=808567 width=95) (actual time=0.020..916
.249 rows=808647 loops=1)
                           Filter: (qdatetime <= now())
                     ->  Seq Scan on quincytrack  (cost=0.00..6678.09
rows=125327 width=215) (actual time=0.030..125.6
49 rows=125437 loops=1)
                           Filter: (qdatetime <= now())
 Total runtime: 3782.759 ms
(11 rows)

Now the script is noticably more enjoyable, thank you!

Do I still need to add indices over the whole union
and what's the syntax please?

I'm also thinking about adding some "pipelining"
(i.e. prefetching 5-10 pages for the HTML-table):
http://datatables.net/release-datatables/examples/server_side/pipeline.html

Regards
Alex

>
> If you do need unique results, then you could create an index on the combination of all those fields. That should take out the need for those sequential scans.

-- 
Sent via pgsql-general mailing list (pgsql-general@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general