Re: Strange (?) Index behavior?

Allen Landsidel <alandsidel@xxxxxxxxx> · Fri, 12 Nov 2004 19:26:39 -0500

On Fri, 12 Nov 2004 17:35:00 -0500, Tom Lane <tgl@xxxxxxxxxxxxx> wrote:
> 
> 
> Allen Landsidel <alandsidel@xxxxxxxxx> writes:
> > On Thu, 11 Nov 2004 16:41:51 -0500, Tom Lane <tgl@xxxxxxxxxxxxx> wrote:
> >> Allen Landsidel <alandsidel@xxxxxxxxx> writes:
> >>> Clustering is really unworkable in this situation.
> >>
> >> Nonetheless, please do it in your test scenario, so we can see if it has
> >> any effect or not.
> 
> > It did not, not enough to measure anyway, which does strike me as
> > pretty odd.
> 
> Me too.  Maybe we are barking up the wrong tree entirely, because I
> really expected to see a significant change.
> 
> Lets start from first principles.  While you are running this query,
> what sort of output do you get from "vmstat 1"?  I'm wondering if it's
> I/O bound or CPU bound ...

I am running systat -vmstat 1 constantly on the box.. it's almost
always I/O bound.. and the numbers are far lower than what I expect
them to be, under 1MB/s.  bonnie++ shows decent scores so.. I'm not
sure what's goin on.

[allen@dbtest01 /mnt_db/work#]bonnie++ -d /mnt_db/work -c 2 -u nobody
Using uid:65534, gid:65534.
Writing a byte at a time...done
Writing intelligently...done
Rewriting...done
Reading a byte at a time...done
Reading intelligently...done
start 'em...done...done...done...done...done...
Create files in sequential order...done.
Stat files in sequential order...done.
Delete files in sequential order...done.
Create files in random order...done.
Stat files in random order...done.
Delete files in random order...done.
Version 1.93c       ------Sequential Output------ --Sequential Input- --Random-
Concurrency   2     -Per Chr- --Block-- -Rewrite- -Per Chr- --Block-- --Seeks--
Machine        Size K/sec %CP K/sec %CP K/sec %CP K/sec %CP K/sec %CP  /sec %CP
dbtest01.distr 300M   100  98 17426  21 17125  18   197  98 182178  99  2027  42
Latency             96208us     594ms     472ms   56751us   15691us    3710ms
Version 1.93c       ------Sequential Create------ --------Random Create--------
dbtest01.distribute -Create-- --Read--- -Delete-- -Create-- --Read--- -Delete--
              files  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP
                 16 12932  90 +++++ +++ 20035  98 11912  91 +++++ +++ 13074  93
Latency             26691us     268us   18789us   26755us   13586us   25039us
1.93c,1.93c,dbtest01.distributedmail.com,2,1100269160,300M,,100,98,17426,21,17125,18,197,98,182178,99,2027,42,16,,,,,12932,90,+++++,+++,20035,98,11912,91,+++++,+++,13074,93,96208us,594ms,472ms,56751us,15691us,3710ms,26691us,268us,18789us,26755us,13586us,25039us

Looking at these numbers, obviously things could be a bit quicker, but
it doesn't look slow enough to my eyes or experience to account for
what I'm seeing with the query performance..

During the query, swap doesn't get touched, the cpus are mostly idle,
but the disk activity seems to be maxed at under 1MB/s, 100% busy.

To refresh and extend..
The box is FreeBSD 4.10-STABLE
Dual 800MHz PIII's, 2GB of memory

Relevent kernel options:

maxusers        512
...
options         SYSVSHM
options         SHMMAXPGS=262144
options         SHMSEG=512
options         SHMMNI=512
options         SYSVSEM
options         SEMMNI=512
options         SEMMNS=1024
options         SEMMNU=512
options         SEMMAP=512

...

nothing custom going on in /etc/sysctl.conf

Filesystem is..
/dev/da1s1e on /mnt_db (ufs, local, noatime, soft-updates)

And, from my postgresql.conf..

shared_buffers = 32768          # min 16, at least max_connections*2, 8KB each
sort_mem = 65536                # min 64, size in KB
vacuum_mem = 65536              # min 1024, size in KB
...
max_fsm_pages = 20000           # min max_fsm_relations*16, 6 bytes each
max_fsm_relations = 1000        # min 100, ~50 bytes each
...
stats_start_collector = true
stats_command_string = true
stats_block_level = false
stats_row_level = true
stats_reset_on_server_start = true

Thanks for helping me out with this Tom and everyone else.  I suppose
it's possible that something could be physically wrong with the drive,
but I'm not seeing anything in syslog.  I'm going to poke around with
camcontrol looking for any bad sectors / remapped stuff while I wait
for replies.

-Allen