On 02/15/2014 10:31 AM, Antman, Jason (CMG-Atlanta) wrote:
Well thanks for someone at least sending a reply, though I suppose I
should have asked "how do I do this", or "what are the major hurdles to
doing this", as it obviously has to be *possible* given unlimited
knowledge, resources and time.
Perhaps I should frame the question differently:
If you had a single ~1TB database, and needed to be able to give fresh
data copies to dev/test environments (which are usually largely idle)
either on demand or daily, how would you do it? The only other thing
that comes to mind is separate postgres instances (running multiple
postgres instances per server?), one per database, for every
environment. Which means that if 80% of the environments are idle at a
given time, I'm effectively wasting 80% of the memory that I have
allocated to shared buffers, etc., and I actually need 4x the resources
I'm using? Unless postgres supports ballooning of memory?
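Roughly what I have in mind, with made-up hostnames, user names, paths
and ports, would be one data directory and one postgres instance per
environment, each seeded with its own full copy of production,
something like:

  # seed one environment with a full copy of production
  pg_basebackup -h prod-db -U replication -D /pgdata/env01 -Fp -X stream -P
  echo "port = 5501" >> /pgdata/env01/postgresql.conf
  # keep per-instance memory small, since most environments sit idle
  echo "shared_buffers = 256MB" >> /pgdata/env01/postgresql.conf
  pg_ctl -D /pgdata/env01 -l /pgdata/env01/server.log start

...and then the same again for env02 through env150, which is exactly
where the copy time and the memory start to hurt.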
<<Thinking out loud>>
Your requirements:
If I am following correctly you want a single universal data set that is
accessible by multiple databases at will. There will be 100-150
different databases to which presumably different things are being done.
Ideally the databases would have different names. Also it would be
helpful if the data could be refreshed without stopping the cluster.
The process:
Start at a known point: the state of the production server at a point in time.
Apply that state as the starting point for 100-150 databases.
Do things to those 100-150 databases that cause them to deviate from the
starting point and each other.
Periodically reset some portion of the databases to a new starting point
based on a different state of the production database.
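One way I could see approximating that reset step, assuming the data
directories live on a copy-on-write filesystem such as ZFS (the dataset
names, paths and port below are invented), would be to keep one restored
base copy per refresh, snapshot it, and clone it per environment,
throwing a clone away and re-cloning whenever you want a new starting
point:

  # pull a fresh base copy from production into an empty dataset, then freeze it
  pg_basebackup -h prod-db -U replication -D /tank/pgbase -Fp -X stream -P
  zfs snapshot tank/pgbase@2014-02-15

  # reset one environment to that starting point
  pg_ctl -D /tank/env42 stop -m fast        # assuming env42 is already running
  zfs destroy tank/env42                    # drop the old clone
  zfs clone tank/pgbase@2014-02-15 tank/env42
  echo "port = 5442" >> /tank/env42/postgresql.conf
  pg_ctl -D /tank/env42 -l /tank/env42/server.log start

Each clone is still a complete, separate cluster with its own instance,
so this is a variation on what you do now; it does not get you one
cluster whose databases share data files.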
The issue:
The killer would seem to be the global(cluster)/local(database) problem.
The cluster has the database in one state, and then you try to impose a
database state from another cluster on it. Even given your stipulation
"to be *possible* given unlimited knowledge, resources and time", I
would say not in a time frame that is going to help you in the
practical future.
Solution or not:
Not seeing how you can do it without using some variation of what you do
now. Part of the problem in trying to come up with a solution is not
knowing what is being done to the test/dev databases.
Is the full dataset a requirement for testing or would a subset do?
Are there 100-150 users/tests each needing a database or could databases
be shared?
Is the 80% idle figure you mention manageable?
In other words do the tests run whenever, so the instances have to
always be available or can they be spun up on demand?
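If on demand is workable, the idle-memory worry mostly goes away, since
a stopped instance holds no shared buffers at all; starting and stopping
one is cheap (same invented paths as above):

  pg_ctl -D /tank/env42 -l /tank/env42/server.log start
  # ... run the tests against port 5442 ...
  pg_ctl -D /tank/env42 stop -m fast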
Thanks,
Jason
On 02/15/2014 01:20 PM, Tom Lane wrote:
"Antman, Jason (CMG-Atlanta)" <Jason.Antman@xxxxxxxxxx> writes:
Perhaps there's a postgres internals expert around, someone intimately familiar with pg_xlog/pg_clog/pg_control, who can comment on whether it's possible to take the on-disk files from a single database in a single tablespace, and make them usable by a different postgres server, running multiple databases?
It is not. There's no need for detailed discussion.
regards, tom lane
--
Adrian Klaver
adrian.klaver@xxxxxxxxx