On 03/29/12 9:43 AM, Jonathan Bartlett wrote:
> 1) A large (~150GB) dataset. This data set is mainly static. It is
> updated, but not by the users (it is updated by our company, which
> provides the data to users). There are some deletions, but it is safe
> to consider this an "add-only" database, where only new records are
> created.
>
> 2) A small (~10MB but growing) dataset. This is the user's data. It
> includes many bookmarks (i.e. foreign keys) into data set #1.
> However, I am not explicitly using any referential integrity system.
by 'dataset', do you mean a table, aka a relation?

by 'not using any referential integrity', do you mean you're NOT using
foreign key constraints ('REFERENCES table(field)' in your table
declarations)?
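If it helps, a declared foreign key looks something like the sketch
below (the table and column names are made up here, just to show the
syntax):

    -- the big, mostly-static dataset
    CREATE TABLE dataset1 (
        id      bigint PRIMARY KEY,
        payload text
    );

    -- the small per-user dataset, with declared "bookmarks" into dataset1
    CREATE TABLE user_bookmarks (
        user_id     integer NOT NULL,
        dataset1_id bigint  NOT NULL REFERENCES dataset1(id),
        PRIMARY KEY (user_id, dataset1_id)
    );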
> Also, many queries cross the datasets together.
by 'cross', do you mean JOIN?
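If so, I'm picturing queries roughly like this (same made-up names as
in the sketch above):

    -- pull the static rows that one user has bookmarked
    SELECT b.user_id, d.id, d.payload
      FROM user_bookmarks b
      JOIN dataset1 d ON d.id = b.dataset1_id
     WHERE b.user_id = 42;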
> Now, my issue is that right now when we do updates to the dataset, we
> have to make them to the live database. I would prefer to manage data
> releases the way we manage software releases - have a staging area,
> test the data, and then deploy it to the users. However, I am not
> sure of the best approach for this. If there weren't lots of crossover
> queries, I could just shove them in separate databases, and then swap
> out dataset #1 when we have a new release.
you can't JOIN data across relations (tables) that live in different
databases; a single PostgreSQL connection only ever sees one database.
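For illustration (the schema and database names below are made up):
tables in different schemas of the *same* database join fine, but a
name that points at another database is rejected outright, so the
swap-out-a-whole-database idea would break every crossover query.
Keeping both datasets in one database, e.g. in separate schemas,
avoids that.

    -- separate schemas, same database: the JOIN still works
    SELECT b.user_id, d.payload
      FROM staging.dataset1 d
      JOIN public.user_bookmarks b ON b.dataset1_id = d.id;

    -- a three-part name pointing at another database does not
    SELECT b.user_id, d.payload
      FROM otherdb.public.dataset1 d
      JOIN public.user_bookmarks b ON b.dataset1_id = d.id;
    -- ERROR:  cross-database references are not implemented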
--
john r pierce N 37, W 122
santa cruz ca mid-left coast