Re: Slow count(*) again...

Neil Whelchel wrote:
I know there have been many discussions on the slowness of count(*), even when an index is involved, because the visibility of the rows has to be checked. In the past I have seen many suggestions about using triggers and tables to keep track of counts, and while this works fine when you know ahead of time what the report is going to be, it is simply not an option when an unknown, dynamically generated WHERE clause is to be used.

I ran into a fine example of this when I was searching this mailing list: "Searching in 856,646 pages took 13.48202 seconds. Site search powered by PostgreSQL 8.3." Obviously at some point count(*) came into play here, because the site made a list of pages (1 2 3 4 5 6 > next). I very commonly make a list of pages from search results, and the biggest time killer is the count(*) portion. Worse yet, I sometimes have to hit the database with two SELECT statements: one with OFFSET and LIMIT to get the page of results I need, and another to get the total number of rows so I can estimate how many pages of results are available.

The point I am driving at is that since building a list of result pages is such a common thing to do, there need to be specific high-speed ways to do this in one query. Maybe an estimate(*) that works like count but gives an answer from the index without checking visibility? I am sure that would be good enough to make a page list; it is really no big deal if it errs on the positive side, maybe the list of pages has an extra page off the end. I can live with that. What I can't live with is taking 13 seconds to get a page of results from 850,000 rows in a table.
-Neil-
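
For what it's worth, when an exact count isn't required, the planner's statistics can stand in for count(*). A minimal sketch, assuming a hypothetical table my_table and a made-up WHERE clause (the estimate is only as fresh as the last ANALYZE, and the window-aggregate form needs 8.4 or later):

    -- Rough row count for the whole table, taken from the planner's
    -- statistics without touching the heap (approximate; refreshed by
    -- ANALYZE/autovacuum).
    SELECT reltuples::bigint AS estimated_rows
    FROM pg_class
    WHERE relname = 'my_table';

    -- One round trip for a page of results plus the exact filtered total,
    -- using a window aggregate; it still pays the visibility check, but it
    -- avoids the second SELECT.
    SELECT *, count(*) OVER () AS total_rows
    FROM my_table
    WHERE some_column = 'some_value'
    ORDER BY id
    LIMIT 50 OFFSET 0;

For a dynamically generated WHERE clause, the planner's row estimate can also be scraped out of plain EXPLAIN output, which is more or less the estimate(*) being asked for, give or take the planner's usual margin of error.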

Unfortunately, the problem is in the rather primitive way PostgreSQL does I/O. It didn't change in 9.0, so there is nothing you would gain by upgrading. If you run strace -o /tmp/pg.out -e read -p <PID of the sequential scan process> and inspect /tmp/pg.out when the query finishes, you will notice a gazillion read requests, all of them 8192 bytes in size. That means PostgreSQL is reading the table block by block, without any merging of the requests.

You can alleviate the pain with OS tricks, like selecting the deadline I/O scheduler (e.g. on the kernel command line in grub.conf) and setting read-ahead on the file system's block devices with blockdev, but nothing fundamental can be done short of rewriting the way PostgreSQL does I/O. There were rumors about asynchronous I/O in version 9.0, but that didn't materialize.

That is really strange to me, because PostgreSQL tables are files (or groups of files, once the table size exceeds 1 GB). It wouldn't be very hard to try reading 1 MB at a time, and that would speed up a full table scan significantly. The problem with single-block I/O is that there is a context switch for each request, and the I/O scheduler has to work hard to merge the requests appropriately; there is really no need for that, since tables are files, and navigating through files is not a problem even with much larger blocks. Another database, whose name I will not mention, has a parameter db_file_multiblock_read_count which specifies how many blocks are read by a single read call during a full table scan. PostgreSQL is in dire need of something similar, and it wouldn't even be that hard to implement.
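
To make the OS-level knobs above concrete, here is a sketch of what I mean (device names and values are only placeholders; check them against the actual hardware):

    # Attach to the backend doing the sequential scan; /tmp/pg.out will show
    # the stream of 8192-byte read() calls.
    strace -o /tmp/pg.out -e read -p <PID of the sequential scan process>

    # Switch the data volume to the deadline I/O scheduler (sdb is an example).
    echo deadline > /sys/block/sdb/queue/scheduler

    # Check and raise the read-ahead on the device holding the data files,
    # so the kernel prefetches well ahead of the 8 kB reads.
    blockdev --getra /dev/sdb
    blockdev --setra 8192 /dev/sdb    # value is in 512-byte sectors (here 4 MB)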


--
Mladen Gogala
Sr. Oracle DBA
1500 Broadway
New York, NY 10036
(212) 329-5251
www.vmsinfo.com

--
Sent via pgsql-performance mailing list (pgsql-performance@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance

