On 03/29/12 9:43 AM, Jonathan Bartlett wrote:
> 1) A large (~150GB) dataset. This data set is mainly static. It is
> updated, but not by the users (it is updated by our company, which
> provides the data to users). There are some deletions, but it is safe
> to consider this an "add-only" database, where only new records are
> created.
>
> 2) A small (~10MB but growing) dataset. This is the user's data. It
> includes many bookmarks (i.e. foreign keys) into data set #1.
> However, I am not explicitly using any referential integrity system.
by 'dataset', do you mean a table, aka a relation?

by 'not using any referential integrity', do you mean you're NOT using
foreign key constraints ('REFERENCES table(field)' in your table
declarations)?
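If it helps, a declared foreign key looks something like the sketch
below (the table and column names are made up here, just to show the
syntax):

    -- the big, mostly-static dataset
    CREATE TABLE dataset1 (
        id      bigint PRIMARY KEY,
        payload text
    );

    -- the small per-user dataset, with declared "bookmarks" into dataset1
    CREATE TABLE user_bookmarks (
        user_id     integer NOT NULL,
        dataset1_id bigint  NOT NULL REFERENCES dataset1(id),
        PRIMARY KEY (user_id, dataset1_id)
    );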
> Also, many queries cross the datasets together.
by 'cross', do you mean JOIN?
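If so, I'm picturing queries roughly like this (same made-up names as
in the sketch above):

    -- pull the static rows that one user has bookmarked
    SELECT b.user_id, d.id, d.payload
      FROM user_bookmarks b
      JOIN dataset1 d ON d.id = b.dataset1_id
     WHERE b.user_id = 42;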
> Now, my issue is that right now when we do updates to the dataset, we
> have to make them to the live database. I would prefer to manage data
> releases the way we manage software releases - have a staging area,
> test the data, and then deploy it to the users. However, I am not
> sure of the best approach for this. If there weren't lots of crossover
> queries, I could just shove them in separate databases, and then swap
> out dataset #1 when we have a new release.
you can't JOIN data across relations (tables) that live in different
databases; a single PostgreSQL connection only ever sees one database.
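For illustration (the schema and database names below are made up):
tables in different schemas of the *same* database join fine, but a
name that points at another database is rejected outright, so the
swap-out-a-whole-database idea would break every crossover query.
Keeping both datasets in one database, e.g. in separate schemas,
avoids that.

    -- separate schemas, same database: the JOIN still works
    SELECT b.user_id, d.payload
      FROM staging.dataset1 d
      JOIN public.user_bookmarks b ON b.dataset1_id = d.id;

    -- a three-part name pointing at another database does not
    SELECT b.user_id, d.payload
      FROM otherdb.public.dataset1 d
      JOIN public.user_bookmarks b ON b.dataset1_id = d.id;
    -- ERROR:  cross-database references are not implemented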
--
john r pierce N 37, W 122
santa cruz ca mid-left coast