Search Postgresql Archives

Re: Using window functions to get the unpaginated count for paginated queries

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Clemens Park wrote:
> Recently, during a performance improvement sweep for an application at
my company, one of the hotspots
> that was discovered was pagination.
> 
> In order to display the correct pagination links on the page, the
pagination library we used (most
> pagination libraries for that matter) ran the query with OFFSET and
LIMIT to get the paginated
> results, and then re-ran the query without the OFFSET and LIMIT and
wrapped them in a SELECT COUNT(*)
> FROM main_query to get the total number of rows.
> 
> In an attempt to optimize this, we used a window function as follows:
> 
> Given a query that looked as follows:
> 
> SELECT a,b,c
> FROM table
> WHERE clauses
> OFFSET x LIMIT y;
> 
> add total_entries_count column as follows:
> 
> SELECT COUNT(table.id) OVER () AS total_entries_count, a,b,c
> FROM table
> WHERE clauses
> OFFSET x LIMIT y;
> 
> This calculates the total number of unpaginated rows correctly,
without affecting the runtime of the
> query.  At least as far as I can tell.

It can affect the runtime considerably.

I created a 10^6 row test table and tried:

test=> EXPLAIN ANALYZE SELECT id, val FROM large OFFSET 100 LIMIT 10;
                                                    QUERY PLAN
------------------------------------------------------------------------
------------------------------------------
 Limit  (cost=1.49..1.64 rows=10 width=12) (actual time=0.177..0.195
rows=10 loops=1)
   ->  Seq Scan on large  (cost=0.00..14902.00 rows=1000000 width=12)
(actual time=0.028..0.114 rows=110 loops=1)
 Total runtime: 0.251 ms
(3 rows)

test=> EXPLAIN ANALYZE SELECT id, val, COUNT(*) OVER () AS
total_entries_count FROM large OFFSET 100 LIMIT 10;
                                                          QUERY PLAN
------------------------------------------------------------------------
------------------------------------------------------
 Limit  (cost=2.74..3.01 rows=10 width=12) (actual
time=1893.606..1893.625 rows=10 loops=1)
   ->  WindowAgg  (cost=0.00..27402.00 rows=1000000 width=12) (actual
time=1893.435..1893.559 rows=110 loops=1)
         ->  Seq Scan on large  (cost=0.00..14902.00 rows=1000000
width=12) (actual time=0.025..647.182 rows=1000000 loops=1)
 Total runtime: 1915.255 ms
(4 rows)

That is because the second query will have to scan all rows, while the
first one can stop scanning after 110 rows.

> The questions I have are:
> 
> 1) Are there any adverse effects that the above window function can
have?

I can only think of the performance degradation mentioned above.

> 2) Are there any cases where the count would return incorrectly?

No.

> 3) In general, is this an appropriate use-case for using window
functions?

I think it is.

Maybe you can do better if you don't retrieve the total count
of rows for every set of rows you select.

Yours,
Laurenz Albe


-- 
Sent via pgsql-general mailing list (pgsql-general@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general



[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
[Index of Archives]     [Postgresql Jobs]     [Postgresql Admin]     [Postgresql Performance]     [Linux Clusters]     [PHP Home]     [PHP on Windows]     [Kernel Newbies]     [PHP Classes]     [PHP Books]     [PHP Databases]     [Postgresql & PHP]     [Yosemite]
  Powered by Linux