Re: Slow count(*) again...

Scott Carey <scott@xxxxxxxxxxxxxxxxx> · Mon, 11 Oct 2010 19:02:51 -0700

On Oct 10, 2010, at 10:14 AM, Mladen Gogala wrote:

> 
> SQL> show parameter db_file_multi
> 
> NAME                                 TYPE        VALUE
> ------------------------------------ ----------- 
> ------------------------------
> db_file_multiblock_read_count        integer     16
> SQL> alter session set db_file_multiblock_read_count=1;
> 
> Session altered.
> SQL> select count(*) from ni_occurrence;
> 
>  COUNT(*)
> ----------
> 402062638
> 
> Elapsed: 00:08:20.88
> SQL> alter session set db_file_multiblock_read_count=128;
> 
> Session altered.
> 
> Elapsed: 00:00:00.50
> SQL>  select count(*) from ni_occurrence;
> 
>  COUNT(*)
> ----------
> 402062638
> 
> Elapsed: 00:02:17.58
> 
> 
> In other words, when I batched the sequential scan to do 128 blocks I/O, 
> it was 4 times faster than when I did the single block I/O.
> Does that provide enough of an evidence and, if not, why not?
> 

Did you tune the linux FS read-ahead first?  You can get large gains by doing that if you are on ext3.
blockdev --setra 2048 <device>

would give you a 1MB read-ahead.  Also, consider XFS and its built-in defragmentation.  I have found that a longer lived postgres DB will get extreme 
file fragmentation over time and sequential scans end up mostly random.  On-line file defrag helps tremendously.

> It maybe so, but slow sequential scan is still the largest single 
> performance problem of PostgreSQL. The frequency with which that topic 
> appears on the mailing lists should serve as a good evidence for that. I 
> did my best to prove my case.  

I'm not sure its all the I/O however.  It seems that Postgres uses a lot more CPU than other DB's to crack open a tuple and inspect it.  Testing on unindexed tables with count(*) I can get between 200MB and 800MB per second off disk max with full cpu utilization (depending on the average tuple size and contents).   This is on a disk array that can do 1200MB/sec.   It always feels dissapointing to not be able to max out the disk throughput on the simplest possible query.  

> Again, requiring "hard numbers" when 
> using the database which doesn't allow tweaking of the I/O size is self 
> defeating proposition. The other databases, like DB2 and Oracle both 
> allow tweaking of that aspect of its operation, Oracle even on the per 
> session basis. If you still claim that it wouldn't make the difference, 
> the onus to prove it is on you.
> 
> -- 
> Mladen Gogala 
> Sr. Oracle DBA
> 1500 Broadway
> New York, NY 10036
> (212) 329-5251
> www.vmsinfo.com 
> 
> 
> -- 
> Sent via pgsql-performance mailing list (pgsql-performance@xxxxxxxxxxxxxx)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-performance

-- 
Sent via pgsql-performance mailing list (pgsql-performance@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance