venu madhav <venutaurus539@xxxxxxxxx> wrote: > Kevin Grittner <Kevin.Grittner@xxxxxxxxxxxx wrote: >> > I calculate the count first. >> >> This and other comments suggest that the data is totally static >> while this application is running. Is that correct? >> > No, the data gets added when the application is running. As I've > mentioned before it could be as faster as 100-400 records per > second. And it is an important application which will be running > 24/7. Then how can you trust that the count you run before selecting is accurate when you run the SELECT? Are they both in the same REPEATABLE READ or SERIALIZABLE transaction? >> Also, you didn't address the issue of storing enough information >> on the page to read off either edge in the desired sequence with >> just a LIMIT and no offset. "Last page" or "page up" would need >> to reverse the direction on the ORDER BY. This would be very >> fast if you have appropriate indexes. Your current technique can >> never be made very fast. >> > I actually didn't understand what did you mean when you said > "storing enough information on the page to read off either edge in > the desired sequence with just a LIMIT and no offset". What kind > of information can we store to improve the performance. Well, for starters, it's entirely possible that the "hitlist" approach posted by Craig James will work better for you than what I'm about to describe. Be sure to read this post carefully: http://archives.postgresql.org/pgsql-performance/2010-05/msg00058.php The reason that might work better than the idea I was suggesting is that the combination of selecting on timestamp and ordering by something else might make it hard to use reasonable indexes to position and limit well enough for the technique I was suggesting to perform well. It's hard to say without testing. For what I was describing, you must use an ORDER BY which guarantees a consistent sequence for the result rows. I'm not sure whether you always have that currently; if not, that's another nail in the coffin of your current technique, since the same OFFSET into the result might be different rows from one time to the next, even if data didn't change. If your ORDER BY can't guarantee a unique set of ordering values for every row in the result set, you need to add any missing columns from a unique index (usually the primary key) to the ORDER BY clause. Anyway, once you are sure you have an ORDER BY which is deterministic, you make sure your software remembers the ORDER BY values for the first and last entries on the page. Then you can do something like (abstractly): SELECT x, y, z FROM a, b WHERE ts BETWEEN m AND n AND a.x = b.a_x AND (x, y) > (lastx, lasty) ORDER BY x, y LIMIT 20; With the right indexes, data distributions, selection criteria, and ORDER BY columns -- that *could* be very fast. If not, look at Craig's post. -Kevin -- Sent via pgsql-performance mailing list (pgsql-performance@xxxxxxxxxxxxxx) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance