Re: Tuning to speed select

"Merlin Moncure" <mmoncure@xxxxxxxxx> · Fri, 11 Aug 2006 10:41:56 -0400

On 8/11/06, Tom Laudeman <twl8n@xxxxxxxxxxxx> wrote:
Merlin,
The problem is disk. I've got a WD Caviar. hdparm says it does 44MB/sec
(I ran that in single user mode so there was nothing interfering). A WD
Caviar SE SATA in one of my servers at home gets 56MB/sec on a quiescent
system at runlevel 3. What kind of values does hdparm give for a SATA
Raptor?

i dont have one handy on a linux box to test, but all raptors are 10k
drives with 16 mb cache. they feel about twice as fast as 7200 rpm ide
drives in general use.

I think my Dell Precision 650 has SATA on the motherboard. The boss says
I can order one drive, so what should I get? How much faster is RAID 0+1
than a single drive?

depends on a lot of factors, near 100% improvement is realistic even
with software raid.  I would tell your boss that you could buy 2 36g
raptors (110$ each) do a simple raid 0.  just be aware that either
drive failing will take you out.  or, you could do raid 1 for
redundancy.

Aside from size, I can't see much difference between these drives (WD
Raptors at NewEgg):
http://www.newegg.com/Product/ProductList.asp?DEPA=0&type=&Description=raptor&Submit=ENE&Ntk=all&N=0&minPrice=&maxPrice=&Go.x=0&Go.y=0

the retail parts are more expensive as is the silly drive that you can
look into.  buy the cheapest part at the size level you need.

CLUSTER certainly helped. Each of the following queries would have
returned roughly 50,000 records. Note that selecting a single record
from blast_result using an index is plenty fast ( ~ 50 ms), so my
primary concern is pulling back larger subsets of data.

maybe. you may have had table bloat as well, cluster does a full table
rebuild like vacuum fuul.

It appears that count(*) on a CLUSTERed table uses the index (as opposed
to the old way of doing a sequential scan). Count on the table after
CLUSTER appears to be a *lot* faster, maybe almost 100x. I know we
shouldn't count, but we've been too lazy to keep the record counts in
another table, and our customers occasionally want to know how many
records are in a certain subset.

no, afaik count(*) uses the table still (try explain analyze). you
just compacted and optimized the table for efficient sequential scans.
are you vacuuming regulary?

merlin