On Oct 11, 2010, at 7:02 PM, Scott Carey wrote:
>
> On Oct 10, 2010, at 10:14 AM, Mladen Gogala wrote:
>
>> SQL> show parameter db_file_multi
>>
>> NAME                                 TYPE        VALUE
>> ------------------------------------ ----------- ------------------------------
>> db_file_multiblock_read_count        integer     16
>>
>> SQL> alter session set db_file_multiblock_read_count=1;
>>
>> Session altered.
>>
>> SQL> select count(*) from ni_occurrence;
>>
>>   COUNT(*)
>> ----------
>>  402062638
>>
>> Elapsed: 00:08:20.88
>>
>> SQL> alter session set db_file_multiblock_read_count=128;
>>
>> Session altered.
>>
>> Elapsed: 00:00:00.50
>>
>> SQL> select count(*) from ni_occurrence;
>>
>>   COUNT(*)
>> ----------
>>  402062638
>>
>> Elapsed: 00:02:17.58
>>
>> In other words, when I batched the sequential scan to do 128-block I/O,
>> it was 4 times faster than when I did single-block I/O.
>> Does that provide enough evidence and, if not, why not?
>
> Did you tune the Linux FS read-ahead first? You can get large gains by
> doing that if you are on ext3:
>
>     blockdev --setra 2048 <device>
>
> would give you a 1MB read-ahead. Scratch that: if you are using DirectIO,
> block read-ahead does nothing. The default is 128K for buffered I/O
> read-ahead.
>
> Also, consider XFS and its built-in defragmentation. I have found that a
> longer-lived Postgres DB will get extreme file fragmentation over time,
> and sequential scans end up mostly random. On-line file defrag helps
> tremendously.
>
>> It may be so, but slow sequential scan is still the largest single
>> performance problem of PostgreSQL. The frequency with which that topic
>> appears on the mailing lists should serve as good evidence for that. I
>> did my best to prove my case.
>
> I'm not sure it's all the I/O, however. It seems that Postgres uses a lot
> more CPU than other DBs to crack open a tuple and inspect it.
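The "4 times faster" figure above can be checked directly from the two elapsed times (00:08:20.88 vs. 00:02:17.58). A quick sketch of the arithmetic, using awk for the floating-point division:

```shell
# Elapsed times from the two Oracle runs above, converted to seconds.
single_block=500.88    # 00:08:20.88 with db_file_multiblock_read_count=1
multi_block=137.58     # 00:02:17.58 with db_file_multiblock_read_count=128

# Ratio of the two runs (awk handles floating point in plain sh):
awk -v a="$single_block" -v b="$multi_block" \
    'BEGIN { printf "speedup: %.2fx\n", a / b }'
```

This prints a speedup of about 3.64x, so "4 times faster" is a fair round number for the measured gain.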
> Testing on unindexed tables with count(*), I can get between 200MB and
> 800MB per second off disk, max, with full CPU utilization (depending on
> the average tuple size and contents). This is on a disk array that can do
> 1200MB/sec. It always feels disappointing not to be able to max out the
> disk throughput on the simplest possible query.
>
>> Again, requiring "hard numbers" when using a database which doesn't
>> allow tweaking of the I/O size is a self-defeating proposition. The
>> other databases, like DB2 and Oracle, both allow tweaking of that
>> aspect of their operation, Oracle even on a per-session basis. If you
>> still claim that it wouldn't make a difference, the onus to prove it
>> is on you.
>>
>> --
>> Mladen Gogala
>> Sr. Oracle DBA
>> 1500 Broadway
>> New York, NY 10036
>> (212) 329-5251
>> www.vmsinfo.com

--
Sent via pgsql-performance mailing list (pgsql-performance@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance
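As an aside on the blockdev --setra suggestion earlier in the thread: the read-ahead value is counted in 512-byte sectors, which is where the 1MB and 128K figures come from. A minimal sketch of that arithmetic (the device name is a placeholder, not from the original message):

```shell
# blockdev read-ahead values are counted in 512-byte sectors.
# 2048 sectors * 512 bytes = 1 MB, matching "would give you a 1MB read-ahead":
echo "$((2048 * 512 / 1024 / 1024)) MB"

# A default of 256 sectors yields the 128K buffered-I/O figure quoted above:
echo "$((256 * 512 / 1024)) KB"

# To inspect and change the setting (substitute your actual data volume):
# blockdev --getra /dev/sda
# blockdev --setra 2048 /dev/sda
```

As noted in the thread, this only matters for buffered I/O; with DirectIO the kernel read-ahead is bypassed entirely.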