I've got a large (and growing) database set up as a
partitioned database. The partitions are physically broken out by state
plus a unique id for each. There's roughly 20 million records in the
whole thing just now. My question is, why does a simple query supplying
both parts of the index key work nearly instantly as expected when I
submit it to the appropriate partition table directly, but the same
query when sent to the master table takes nearly 3/4 of a minute to
return one record? Actually, running the queries with 'Explain analyze
verbose' tells me what it chose, so I know it's slopw because it chose
to do a sequential scan on the master table but what I'd like to know
is why does it take so long to go through the master table looking for
the partition for 'co'? According to the log, if I read it correctly,
it took nearly 40 seconds just to scan through the 19 partition tables
before it found the colorado partition. Can soeone tell me ho wot speed
up that step? Also (this might be relevant) I accidentally got 15 million records into the master table earlier, but those have all been deleted and I've run VACUUM ANALYZE on the master table since then. The following shows the details and the environment. I'm using PostgreSQL 8.4.5 and running on CentOS 5.5 This is the master table. It has no records or indexes as per the PG manual. fsa=# \d clu Table "vfm.clu" Column | Type | Modifiers -------------+---------------+----------- ogc_fid | bigint | not null geom | geometry | comments | character(80) | statecd | character(2) | countycd | character(3) | tractnbr | character(7) | farmnbr | character(7) | clunbr | numeric(7,0) | acres | numeric(8,2) | fsa_acres | numeric(8,2) | heltypecd | character(1) | cluclscd | numeric(2,0) | cluid | character(36) | admnstate | character(2) | admncounty | character(3) | source_disc | character(2) | not null This is one of the partition tables. It has the same structure, although the key fields are not in the same order as the master table. It is also indexed on source_disc + ogc_fid (and spatially as well). Its constraint is that only records with 'co' in the source_disk attribute can be added or found here. fsa=# \d clu_co Table "vfm.clu_co" Column | Type | Modifiers -------------+---------------+----------- geom | geometry | comments | character(80) | statecd | character(2) | countycd | character(3) | tractnbr | character(7) | farmnbr | character(7) | clunbr | numeric(7,0) | acres | numeric(8,2) | fsa_acres | numeric(8,2) | heltypecd | character(1) | cluclscd | numeric(2,0) | cluid | character(36) | admnstate | character(2) | admncounty | character(3) | ogc_fid | bigint | not null source_disc | character(2) | not null Indexes: "clu_co_pkey" PRIMARY KEY, btree (source_disc, ogc_fid) "clu_co_geom" gist (geom) Check constraints: "cd_id" CHECK (source_disc = 'co'::bpchar) Inherits: clu Here's the query that executes quickly in the partition table. Notice that it's using the index for a fast lookup. fsa=# explain analyze verbose select :flds from clu_co where source_disc='co' and ogc_fid = 116337; QUERY PLAN ---------------------------------------------------------------------------------------------------------- ----------- Index Scan using clu_co_pkey on clu_co (cost=0.00..8.31 rows=1 width=48) (actual time=0.079..0.086 rows= 1 loops=1) Output: source_disc, ogc_fid, statecd, countycd, tractnbr, farmnbr, clunbr, acres Index Cond: ((source_disc = 'co'::bpchar) AND (ogc_fid = 116337)) Total runtime: 0.177 ms (4 rows) fsa=# select :flds from clu where source_disc='co' and ogc_fid = 116337; source_disc | ogc_fid | statecd | countycd | tractnbr | farmnbr | clunbr | acres -------------+---------+---------+----------+----------+---------+--------+--------- co | 116337 | 08 | 043 | 0000533 | 0000065 | 9 | 4677.79 (1 row) The same query when sent through the master table. Notice it's using a sequential scan. But why does this operation take 38 seconds? How do I speed that up? fsa=# explain analyze verbose select :flds from clu where source_disc='co' and ogc_fid = 116337; QUERY PLAN ---------------------------------------------------------------------------------------------------------- Result (cost=0.00..1098364.31 rows=2 width=52) (actual time=38367.332..38367.355 rows=1 loops=1) Output: vfm.clu.source_disc, vfm.clu.ogc_fid, vfm.clu.statecd, vfm.clu.countycd, vfm.clu.tractnbr, vfm.clu.farmnbr, vfm.clu.clunbr, vfm.clu.acres -> Append (cost=0.00..1098364.31 rows=2 width=52) (actual time=38367.325..38367.339 rows=1 loops=1) -> Seq Scan on clu (cost=0.00..1098356.00 rows=1 width=57) (actual time=38367.222..38367.222 rows=0 loops=1) Output: vfm.clu.source_disc, vfm.clu.ogc_fid, vfm.clu.statecd, vfm.clu.countycd, vfm.clu.tractnbr, vfm.clu.farmnbr, vfm.clu.clunbr, vfm.clu.acres Filter: ((source_disc = 'co'::bpchar) AND (ogc_fid = 116337)) -> Index Scan using clu_co_pkey on clu_co clu (cost=0.00..8.31 rows=1 width=48) (actual time=0.090..0.096 rows=1 loops=1) Output: vfm.clu.source_disc, vfm.clu.ogc_fid, vfm.clu.statecd, vfm.clu.countycd, vfm.clu.tractnbr, vfm.clu.farmnbr, vfm.clu.clunbr, vfm.clu.acres Index Cond: ((source_disc = 'co'::bpchar) AND (ogc_fid = 116337)) Total runtime: 38367.450 ms (10 rows) TIA, - Bill --
Bill Thoen |