On Fri, 12 Nov 2004 17:35:00 -0500, Tom Lane <tgl@xxxxxxxxxxxxx> wrote: > > > Allen Landsidel <alandsidel@xxxxxxxxx> writes: > > On Thu, 11 Nov 2004 16:41:51 -0500, Tom Lane <tgl@xxxxxxxxxxxxx> wrote: > >> Allen Landsidel <alandsidel@xxxxxxxxx> writes: > >>> Clustering is really unworkable in this situation. > >> > >> Nonetheless, please do it in your test scenario, so we can see if it has > >> any effect or not. > > > It did not, not enough to measure anyway, which does strike me as > > pretty odd. > > Me too. Maybe we are barking up the wrong tree entirely, because I > really expected to see a significant change. > > Lets start from first principles. While you are running this query, > what sort of output do you get from "vmstat 1"? I'm wondering if it's > I/O bound or CPU bound ... I am running systat -vmstat 1 constantly on the box.. it's almost always I/O bound.. and the numbers are far lower than what I expect them to be, under 1MB/s. bonnie++ shows decent scores so.. I'm not sure what's goin on. [allen@dbtest01 /mnt_db/work#]bonnie++ -d /mnt_db/work -c 2 -u nobody Using uid:65534, gid:65534. Writing a byte at a time...done Writing intelligently...done Rewriting...done Reading a byte at a time...done Reading intelligently...done start 'em...done...done...done...done...done... Create files in sequential order...done. Stat files in sequential order...done. Delete files in sequential order...done. Create files in random order...done. Stat files in random order...done. Delete files in random order...done. Version 1.93c ------Sequential Output------ --Sequential Input- --Random- Concurrency 2 -Per Chr- --Block-- -Rewrite- -Per Chr- --Block-- --Seeks-- Machine Size K/sec %CP K/sec %CP K/sec %CP K/sec %CP K/sec %CP /sec %CP dbtest01.distr 300M 100 98 17426 21 17125 18 197 98 182178 99 2027 42 Latency 96208us 594ms 472ms 56751us 15691us 3710ms Version 1.93c ------Sequential Create------ --------Random Create-------- dbtest01.distribute -Create-- --Read--- -Delete-- -Create-- --Read--- -Delete-- files /sec %CP /sec %CP /sec %CP /sec %CP /sec %CP /sec %CP 16 12932 90 +++++ +++ 20035 98 11912 91 +++++ +++ 13074 93 Latency 26691us 268us 18789us 26755us 13586us 25039us 1.93c,1.93c,dbtest01.distributedmail.com,2,1100269160,300M,,100,98,17426,21,17125,18,197,98,182178,99,2027,42,16,,,,,12932,90,+++++,+++,20035,98,11912,91,+++++,+++,13074,93,96208us,594ms,472ms,56751us,15691us,3710ms,26691us,268us,18789us,26755us,13586us,25039us Looking at these numbers, obviously things could be a bit quicker, but it doesn't look slow enough to my eyes or experience to account for what I'm seeing with the query performance.. During the query, swap doesn't get touched, the cpus are mostly idle, but the disk activity seems to be maxed at under 1MB/s, 100% busy. To refresh and extend.. The box is FreeBSD 4.10-STABLE Dual 800MHz PIII's, 2GB of memory Relevent kernel options: maxusers 512 ... options SYSVSHM options SHMMAXPGS=262144 options SHMSEG=512 options SHMMNI=512 options SYSVSEM options SEMMNI=512 options SEMMNS=1024 options SEMMNU=512 options SEMMAP=512 ... nothing custom going on in /etc/sysctl.conf Filesystem is.. /dev/da1s1e on /mnt_db (ufs, local, noatime, soft-updates) And, from my postgresql.conf.. shared_buffers = 32768 # min 16, at least max_connections*2, 8KB each sort_mem = 65536 # min 64, size in KB vacuum_mem = 65536 # min 1024, size in KB ... max_fsm_pages = 20000 # min max_fsm_relations*16, 6 bytes each max_fsm_relations = 1000 # min 100, ~50 bytes each ... stats_start_collector = true stats_command_string = true stats_block_level = false stats_row_level = true stats_reset_on_server_start = true Thanks for helping me out with this Tom and everyone else. I suppose it's possible that something could be physically wrong with the drive, but I'm not seeing anything in syslog. I'm going to poke around with camcontrol looking for any bad sectors / remapped stuff while I wait for replies. -Allen