Search Postgresql Archives

Fwd: Re: Restore postgresql data directory to tablespace on new host? Or swap tablespaces?

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Forwarding back to list.


-------- Original Message --------
Subject: Re: Restore postgresql data directory to tablespace on new host? Or swap tablespaces?
Date: Sat, 15 Feb 2014 22:08:51 +0000
From: Antman, Jason (CMG-Atlanta) <Jason.Antman@xxxxxxxxxx>
To: Adrian Klaver <adrian.klaver@xxxxxxxxx>

Replies inline below.

Thanks to everyone who's responded so far. The more I explain this, and
answer questions, the more I see how my original "brilliant" idea
(multiple DBs per postgres instance on one host, instead of 1:1:1
DB:postgres:host) is insane, without some specific support for it in
postgres. So... I suppose, barring some suggestion nobody's made yet,
the best we'll be able to do is try to run multiple postgres instances
on each host, and manage the whole "service postgresql-9.0-24 stop"
craziness that comes with that...

Is anyone aware of documentation or best practices for running multiple
postgres instances on the same host, and how to maximize resource
sharing between them?

Thanks,
Jason

On 02/15/2014 04:26 PM, Adrian Klaver wrote:
On 02/15/2014 10:31 AM, Antman, Jason (CMG-Atlanta) wrote:
Well thanks for someone at least sending a reply, though I suppose I
should have asked "how do I do this", or "what are the major hurdles to
doing this", as it obviously has to be *possible* given unlimited
knowledge, resources and time.

Perhaps I should frame the question differently:

If you had a single ~1TB database, and needed to be able to give fresh
data copies to dev/test environments (which are usually largely idle)
either on demand or daily, how would you do it? The only other thing
that comes to mind is separate postgres instances (running multiple
postgres instances per server?), one per database, for every
environment. Which means that if 80% of the environments are idle at a
given time, I'm effectively wasting 80% of the memory that I have
allocated to shared buffers, etc. and I actually need 4x the resources
I'm using? Unless postgres supports balooning of memory?

<<Thinking out loud>>

Your requirements:

If I am following correctly you want a single universal data set that
is accessible by multiple databases at will. There will be 100-150
different databases to which presumably different things are being
done. With, ideally, the databases having different names. Also it
would be helpful if the data could be refreshed without stopping the
cluster.
Yup, that's the gist of it (what we do now for ~50 copies, each on their
own virtual machine).

The process:

Start at known point, the state of the production server at point in
time.

Apply that state as the starting point for 100-150 databases.

Do things to those 100-150 databases that cause them to deviate from
the starting point and each other.

Periodically reset some portion of the databases to a new starting
point based on a different state of the production database.
Yup. Exactly.

The issue:

The killer would seem to be the global(cluster)/local(database)
problem. The cluster has the database at one state and then you try
impose a database state from another cluster on it. Even given your
stipulation-.."to be *possible* given unlimited knowledge, resources
and time.", I would say not in a time frame that is going to help you
in the practical future.
Yeah, I guess. it's sounding like, aside from getting way down into the
internals of postgres and spending months coding a really bad hack that
won't work half the time, there's no sane way to stop the cluster (or
not?) and force it to throw away everything it knows about a database,
or replace that "knowledge" with fixed information... which would be
what I'd need to do, I guess.

Solution or not;

Not seeing how you can do it without using some variation of what you
do now. Part of the problem in trying to come up with a solution is
not knowing what is being done to the test/dev databases.

Is the full dataset a requirement for testing or would a subset do?
Answered in my last post... effectively, yes, because the application
ORM is so broken that nobody can get a valid-but-reduced dataset. The
last attempt at deleting old data *through* the application's ORM ended
up shrinking the DB size on disk (after vacuum) by under 50% in... I
think 2 weeks of running around the clock.
Are there 100-150 users/tests each needing a database or could
databases be shared?
That was based on 50 users each needing a maximum of 2-3 databases per
day. No sharing, as it invalidates any testing they're doing.

Is the 80% idle figure you mention manageable?
In other words do the tests run whenever, so the instances have to
always be available or can they be spun up on demand?
Well... define "on demand". Currently they're always available, except
when someone runs the command to refresh one, which currently takes
approx. 120 seconds. If we could figure out how to, perhaps, snapshot
virtual machines with the OS already booted, maybe that would work. But
given the current expectation (~2 minutes start to finish for a refresh)
I'm not sure we could reasonably spin them up on demand. Moreover, if we
were spinning up a VM on demand, we'd end up with the same situation
we're in now - every dev comes in in the morning and spins up their VM.



Thanks,
Jason

On 02/15/2014 01:20 PM, Tom Lane wrote:
"Antman, Jason (CMG-Atlanta)" <Jason.Antman@xxxxxxxxxx> writes:
Perhaps there's a postgres internals expert around, someone
intimitely familiar with pg_xlog/pg_clog/pg_control, who can
comment on whether it's possible to take the on-disk files from a
single database in a single tablespace, and make them usable by a
different postgres server, running multiple databases?
It is not.  There's no need for detailed discussion.

            regards, tom lane






--

Jason Antman | Systems Engineer | CMGdigital
jason.antman@xxxxxxxxxx | p: 678-645-4155




--
Sent via pgsql-general mailing list (pgsql-general@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general




[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
[Index of Archives]     [Postgresql Jobs]     [Postgresql Admin]     [Postgresql Performance]     [Linux Clusters]     [PHP Home]     [PHP on Windows]     [Kernel Newbies]     [PHP Classes]     [PHP Books]     [PHP Databases]     [Postgresql & PHP]     [Yosemite]
  Powered by Linux