Clemens Park wrote: > Recently, during a performance improvement sweep for an application at my company, one of the hotspots > that was discovered was pagination. > > In order to display the correct pagination links on the page, the pagination library we used (most > pagination libraries for that matter) ran the query with OFFSET and LIMIT to get the paginated > results, and then re-ran the query without the OFFSET and LIMIT and wrapped them in a SELECT COUNT(*) > FROM main_query to get the total number of rows. > > In an attempt to optimize this, we used a window function as follows: > > Given a query that looked as follows: > > SELECT a,b,c > FROM table > WHERE clauses > OFFSET x LIMIT y; > > add total_entries_count column as follows: > > SELECT COUNT(table.id) OVER () AS total_entries_count, a,b,c > FROM table > WHERE clauses > OFFSET x LIMIT y; > > This calculates the total number of unpaginated rows correctly, without affecting the runtime of the > query. At least as far as I can tell. It can affect the runtime considerably. I created a 10^6 row test table and tried: test=> EXPLAIN ANALYZE SELECT id, val FROM large OFFSET 100 LIMIT 10; QUERY PLAN ------------------------------------------------------------------------ ------------------------------------------ Limit (cost=1.49..1.64 rows=10 width=12) (actual time=0.177..0.195 rows=10 loops=1) -> Seq Scan on large (cost=0.00..14902.00 rows=1000000 width=12) (actual time=0.028..0.114 rows=110 loops=1) Total runtime: 0.251 ms (3 rows) test=> EXPLAIN ANALYZE SELECT id, val, COUNT(*) OVER () AS total_entries_count FROM large OFFSET 100 LIMIT 10; QUERY PLAN ------------------------------------------------------------------------ ------------------------------------------------------ Limit (cost=2.74..3.01 rows=10 width=12) (actual time=1893.606..1893.625 rows=10 loops=1) -> WindowAgg (cost=0.00..27402.00 rows=1000000 width=12) (actual time=1893.435..1893.559 rows=110 loops=1) -> Seq Scan on large (cost=0.00..14902.00 rows=1000000 width=12) (actual time=0.025..647.182 rows=1000000 loops=1) Total runtime: 1915.255 ms (4 rows) That is because the second query will have to scan all rows, while the first one can stop scanning after 110 rows. > The questions I have are: > > 1) Are there any adverse effects that the above window function can have? I can only think of the performance degradation mentioned above. > 2) Are there any cases where the count would return incorrectly? No. > 3) In general, is this an appropriate use-case for using window functions? I think it is. Maybe you can do better if you don't retrieve the total count of rows for every set of rows you select. Yours, Laurenz Albe -- Sent via pgsql-general mailing list (pgsql-general@xxxxxxxxxxxxxx) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general