Re: Query slows after offset of 100K

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Thu, 14 Feb 2008, Michael Lorenz wrote:
When offsetting up to about 90K records, the EXPLAIN ANALYZE is similar to the following:
Limit  (cost=15357.06..15387.77 rows=20 width=35) (actual time=19.235..19.276 rows=20 loops=1)
  ->  Index Scan using account_objectname on "object" o  (cost=0.00..1151102.10 rows=749559 width=35) (actual time=0.086..14.981 rows=10020 loops=1)
        Index Cond: (accountid = 354)
        Filter: ((NOT deleted) OR (deleted IS NULL))
Total runtime: 19.315 ms

Since this is scanning through 10,000 random rows in 19 milliseconds, I say all this data is already in the cache. If it wasn't, you'd be looking at 10,000 random seeks on disk, at about 7ms each, which is 70 seconds. Try dropping the OS caches (on Linux echo "1" >/proc/sys/vm/drop_caches) and see if the performance is worse.

If I move the offset up to 100K records or higher, I get:
Limit  (cost=145636.26..145636.31 rows=20 width=35) (actual time=13524.327..13524.355 rows=20 loops=1)
  ->  Sort  (cost=145386.26..147260.16 rows=749559 width=35) (actual time=13409.216..13481.793 rows=100020 loops=1)
        Sort Key: objectname
        ->  Seq Scan on "object" o  (cost=0.00..16685.49 rows=749559 width=35) (actual time=0.011..1600.683 rows=749549 loops=1)
              Filter: (((NOT deleted) OR (deleted IS NULL)) AND (accountid = 354))
Total runtime: 14452.374 ms

And here, it only takes 1.5 seconds to fetch the entire table from disc (or it's already in the cache or something), but 14 seconds to sort the whole lot in memory.

In any case, Postgres is making a good choice - it's just that you have an unexpected benefit in the first case that the data is in cache. Setting the effective cache size correctly will help the planner in this case. Setting work_mem higher will improve the performance of the sort in the second case.

Of course, what others have said about trying to avoid large offsets is good advice. You don't actually need a unique index, but it makes it simpler if you do.

Matthew

--
The early bird gets the worm. If you want something else for breakfast, get
up later.

---------------------------(end of broadcast)---------------------------
TIP 9: In versions below 8.0, the planner will ignore your desire to
      choose an index scan if your joining column's datatypes do not
      match

[Postgresql General]     [Postgresql PHP]     [PHP Users]     [PHP Home]     [PHP on Windows]     [Kernel Newbies]     [PHP Classes]     [PHP Books]     [PHP Databases]     [Yosemite]

  Powered by Linux