Re: Paged Query

Greg Spiegelberg <gspiegelberg@xxxxxxxxx> · Mon, 9 Jul 2012 08:33:36 -0600

On Mon, Jul 9, 2012 at 8:16 AM, Craig James <cjames@xxxxxxxxxxxxxx> wrote:

A good solution to this general problem is "hitlists."  I wrote about this concept before:

http://archives.postgresql.org/pgsql-performance/2010-05/msg00058.php

I implemented  this exact strategy in our product years ago.  Our queries were once quite complicated involving many nested sub-SELECT's and several JOIN's per SELECT.  The basics of our implementation now consists of

 1. A table tracking all "cache" tables.  A cache table is a permanent table once represented as one of the former sub-SELECT's.  The table includes the MD5 hash of the query used to create the table, time created, query type (helps to determine expire time), and a comment field to help in debugging.
 2. Simple logic checking for the existence of the cache table and creating it if it does not.
 3. Using one or many of the named cache tables in the final query using ORDER BY-LIMIT-OFFSET in a CURSOR.
 4. One scheduled backend process to clear the "expired" cache tables based on the query type.

Reason for the CURSOR is to execute once to get a tally of records for pagination purposes then rewind and fetch the right "page".

Highly recommended.

-Greg