I have a unique requirement. I have a feed of 2.5 to 3 million rows of
data which arrives every half hour. Each row has 2 small string
values (about 50 chars each) and 10 int values. I need
searchability and the ability to run arbitrary queries on any of these values.
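For illustration, a typical ad hoc query might look something like
this (datatable is the name I assume the applications would query,
and the filter values are made up):

    select fe, va, a, b
    from datatable
    where fe = 'some string value'
      and c between 100 and 1000
    order by d;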
This means I have to create an index on every column. The feed comes in
as a comma-separated text file. Here is what I am planning to do:
1) Create a new table every time a new feed file comes in. Create the
table with indexes. Use the COPY command to load the data into the
table (sketched below).
2) Rename the current table to some old table name and rename the new
table to the current table name, so that applications can keep accessing
it under the same name.
Note that these are read-only tables, and it is fine if step 2 takes
a small amount of time (it is not a mission-critical table, so a
small downtime of a few seconds is fine).
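For concreteness, steps 1 and 2 together would look roughly like this
(the feed file path is made up, and I am assuming the live table the
applications query is called datatable):

    -- step 1: build the new table and bulk-load the feed
    create table datatablenew ( ... );   -- full script below
    copy datatablenew from '/path/to/feed.csv' with csv;

    -- step 2: swap the new table in with two renames
    begin;
    alter table datatable rename to datatableold;
    alter table datatablenew rename to datatable;
    commit;
    -- datatableold can then be dropped or kept for reference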
My question is: what is the best way to do step 1 so that after the
copy is done, the table is fully indexed, properly balanced, and
optimized for queries?
Should I create the indexes before or after the import? I need to do this
in the shortest time possible so that the data is always up to date. Note
that incremental updates are not possible, since almost every row will be
changed in the new file.
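In other words, the alternative ordering I am considering is to create
the bare table, load it, and only then build the indexes; I assume a
final ANALYZE is also needed so the planner has fresh statistics:

    create table datatablenew ( ... );   -- same columns, but no indexes yet
    copy datatablenew from '/path/to/feed.csv' with csv;
    create index fe_idx on datatablenew using hash (fe);
    -- ... the remaining indexes, as in the script below
    analyze datatablenew;                -- refresh planner statistics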
My table creation script looks like this:
create table datatablenew (
    fe varchar(40), va varchar(60),
    a int, b int, c int, d int, e int, f int, g int, h int,
    i int, j int, k int, l int, m int, n int, o int, p int,
    q real
);
create index fe_idx on datatablenew using hash (fe);
create index va_idx on datatablenew using hash (va);
create index a_idx on datatablenew (a);
......
create index q_idx on datatablenew (q);
Please advise.
Thanks,
Vijay