Re: Large DB

Ericson Smith <eric@did-it.com> · Tue, 06 Apr 2004 09:27:04 -0400

I've been following this thread with interest since it started, and it 
really seems that there is just too much data in that single table. When 
it comes down to it, making smaller separate tables seems to be the way 
to go. Querying will be a little harder, but much faster.

Warmest regards, 
Ericson Smith

Tracking Specialist/DBA

+-----------------------+---------------------------------+

| http://www.did-it.com | "When you have to shoot, shoot, |

| eric@did-it.com       | don't talk!             - Tuco  |

| 516-255-0500          |                                 |

+-----------------------+---------------------------------+ 

Manfred Koizar wrote:

On Sat, 03 Apr 2004 22:39:31 -0800, "Mooney, Ryan" <ryan.mooney@pnl.gov>

wrote:

Ok, so I ran a vacuum analyse.  It took ~1.7 days to finish.

Just to make it clear:  VACUUM and ANALYSE are two different commands.  

VACUUM is for cleaning up.  It has to visit every tuple in every page,
and if there are dead row versions, it also has to scan all indices
belonging to the table.  If there are lots of deleted rows and
vacuum_mem is to small, VACUUM has to scan each index several times to
delete all index entries pointing to dead rows.  This might raise the
cost to even more than O(tuplecount).

ANALYSE collects a random sample of the rows in the table, the sample
size depends on default_statistics_target and the maximum value you have
set any column's statistics target to (ALTER TABLE ... ALTER COLUMN ...
SET STATISTICS ...).  If you didn't touch either, the sample size is
3000 rows.  Then these 3000 rows are sorted and counted in different
ways to generate statistics.

The number of pages that have to be touched to collect the sample
depends on the table size, but it does by far not grow proportionally to
the number of pages, nblocks.  The cost growth rate is greater than
O(ln(nblocks)) and significantly lesser than O(nblocks).  I have no
simple formula for it, but I estimate that analysing your tp3 table
would need between 28000 and 30000 page reads, which should be doable in
a few minutes.

VACUUM ANALYSE is just VACUUM followed by ANALYSE with the added
benefit, that the number of rows does not have to be estimated by
ANALYSE, because VACUUM knows the exact value.

The invalid page block was caused when I tried the 2.6 kernel 
(for other reasons than DB performance), its been there for a while, 
and I can deal w/ the data loss

ERROR:  invalid page header in block 10257032 of "tp3_point_starttime"

AFAICS the invalid page is in an index, so there is no data loss.  You
could simply drop and re-create that index.  That might take some time,
though :-(

Here is the explain analyse, you can see why I think that an index on

just host might be

better - hosts are a small set, starttime is a large set so the index

should be more 
Efficient,

I think you got that backwards.  If there are not many hosts, then an
index on host is not very selective, IOW you get a lot of hits when you
look for a particular host.  OTOH if you select a sufficiently small
starttime interval, you get only a few rows, so using an index is most
efficient.

at the very least it should be (starttime, host), not (host,

starttime) unless

the indexing engine is smart enough to make that not matter (I suspect

its not???).

Yes, it should be (starttime, host).  And no, PG is generally not smart
enough to use an index if there is no condition on the first index
column.

 ->  Index Scan using tp3_host_starttime, tp3_host_starttime,

     [...], tp3_host_starttime on tp3

(cost=0.00..195.07 rows=1 width=25) (actual time=59.42..96366.43

rows=206238 loops=1)

Estimated number of rows: 1
Actual number of rows: 206238
The planner is way off here.  Furtunately your huge number of rows makes
it rule out every other (most probably slower) plan.

How many hosts are there?  Even if there are a few hundred, an index
scan with that condition has to access and skip millions of index
tuples.  An index on (starttime, host) would visit less index tuples,
and would more likely access the heap tuples in physical order.

Having followed the ongoing discusion about this I can concur that it is

definitely NOT O(1).  Unfortunately I didn't track the "time to vacuum"

The discussion on -hackers and the patch I posted earlier today are
about ANALYSE, not VACUUM.

However I believe that I'm going to follow the suggestions

about reducing the table size so I'll have a brand new BD to play with

in a couple weeks,

Hopefully we'll see a success story here.

so knowing what I know now, I can track that if anyones 
interested in the data besides me :)

VACUUM and ANALYSE times?  Certainly.

Servus
Manfred

---------------------------(end of broadcast)---------------------------
TIP 3: if posting/reading through Usenet, please send an appropriate
     subscribe-nomail command to majordomo@postgresql.org so that your
     message can get through to the mailing list cleanly

begin:vcard
fn:Ericson Smith
n:Smith;Ericson
org:Did-it.com;Programming
adr:#304;;55 Maple Avenue;Rockville Center;NY;11570;USA
email;internet:eric@did-it.com
title:Web Developer
tel;work:516-255-0500
tel;cell:646-483-3420
note:Nothing special!
x-mozilla-html:FALSE
url:http://www.did-it.com
version:2.1
end:vcard

---------------------------(end of broadcast)---------------------------
TIP 9: the planner will ignore your desire to choose an index scan if your
      joining column's datatypes do not match