
Re: Restore postgresql data directory to tablespace on new host? Or swap tablespaces?

On 02/15/2014 02:00 PM, Francisco Olarte wrote:
> Hi:
>
> On Sat, Feb 15, 2014 at 7:31 PM, Antman, Jason (CMG-Atlanta)
> <Jason.Antman@xxxxxxxxxx> wrote:
>> Well thanks for someone at least sending a reply, though I suppose I
>> should have asked "how do I do this", or "what are the major hurdles to
>> doing this", as it obviously has to be *possible* given unlimited
>> knowledge, resources and time.
> Some things have been proven impossible, maybe this is one of those.
> Maybe the lack of answers is due to people having the same impression
> of your original message as I did. I mean, he NEEDS 100-150 dev/test
> copies?
The short answer is... due to too much technical debt, and some perhaps 
bad decisions made in the past... yeah. We've dug ourselves into this 
hole, and there's no feasible way out. Currently we have ~50 dev/test 
environments, each of which has a clone of the full production database 
(on a NetApp, using flexclones, so it's *much* smaller in terms of 
actual disk usage). Each of them is backed by its own postgres instance 
running in a dedicated VM. Data refreshes are done on-demand by stopping 
the postgres instance, unmounting and deleting the volume backing it, 
making a new clone of the daily snapshot, mounting it, and bringing 
postgres back up. But we don't have the money (or, more importantly, 
datacenter space) to dedicate a piece of hardware to each of these. So 
most dev environments can only handle ~1 person using the application at 
a time, due to trying to run a complex ORM (Django) against a ~1T 
database on a host with < 24G RAM. Actual testing happens in a different 
environment, which runs on bare metal. Load tests happen in yet another 
environment, with a master and slave.

So, we really need ~50 copies that can be refreshed at arbitrary 
intervals. Since I've already determined that's absolutely impossible 
without stopping postgres, I calculated 100-150 assuming we do bulk 
refreshes at night, and each developer is allocated 2-3 possible 
refreshes in a given 24-hour period.
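For what it's worth, the stop/unmount/clone/mount/start cycle described above boils down to something like the following. This is only a dry-run sketch: the environment name and paths are made-up placeholders, and the filer step is elided because the exact clone command depends on the NetApp setup.

```shell
#!/bin/sh
# Dry-run sketch of one on-demand data refresh. ENV and the paths are
# hypothetical; commands are echoed rather than executed so the
# sequence is easy to read.
ENV=dev01
DATADIR=/pgdata/$ENV

run() { echo "+ $*"; }   # swap the echo for "$@" to actually execute

run pg_ctl -D "$DATADIR" -m fast stop         # stop the instance
run umount "$DATADIR"                         # unmount the old clone
# ... delete the old volume and clone today's snapshot on the filer ...
run mount "filer:/pgdata_$ENV" "$DATADIR"     # mount the fresh clone
run pg_ctl -D "$DATADIR" -l "$DATADIR/pg.log" start
```

With the nightly bulk refreshes this would just run in a loop over the environment names.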

Since we've allowed the database to grow without using Django's 
facilities to split largely unrelated objects into different databases 
and handle any JOINs in the application, and we've gradually let the 
database diverge from the models actually in the ORM, every attempt at 
crafting a DB with a restricted set of recent data has failed. And yes, 
unfortunately, individual developers have had their own private database 
instance for long enough that migrating to a shared DB approach has been 
discarded as an option.

>
>> Perhaps I should frame the question differently:
>> If you had a single ~1TB database, and needed to be able to give fresh
>> data copies to dev/test environments (which are usually largely idle)
>> either on demand or daily, how would you do it? The only other thing
>> that comes to mind is separate postgres instances (running multiple
>> postgres instances per server?), one per database, for every
>> environment. Which means that if 80% of the environments are idle at a
>> given time, I'm effectively wasting 80% of the memory that I have
>> allocated to shared buffers, etc. and I actually need 4x the resources
>> I'm using? Unless postgres supports ballooning of memory?
> If I NEEDED to provide 100-150 snapshots to test/dev environments,
> 20% of which may be active, I'd set up a cluster, buy somewhere above
> a quarter terabyte of RAM and some big snapshot appliances. They are
> not that expensive, and someone NEEDING dev/test systems that big
> must be on a humongous project, where the cost of throwing hardware
> at the problem would be dwarfed by any small delay from trying to set
> it up with minimal resources. A postgres instance should happily run
> in half a gig; using proper snapshots would let you share cache among
> them, so you could fit quite a lot of them on one of those 384G
> machines floating around, even on a 192G one. Throw a snapshotting
> filesystem or appliance into the mix and it's not that complex.
I'm not sure what you mean by "snapshot appliance". If you mean 
something like a NetApp filer, yeah, that's what we're using. 
Unfortunately, no, we're on a project that's far from having an 
unlimited budget. The requirement for that big of a dev/test system is 
above... essentially, a lot of bad decisions that got us to the point of 
having a single ~1T database and no successful attempt at removing any 
significant amount of data from it.

So you're suggesting running multiple instances of postgres in the same 
operating system, and they'll be able to share cache, instead of trying 
to get multiple DBs in the same instance? Is there any accepted method 
of managing N postgres instances on one host? Any tools or documentation 
for that, or best practices? Seems scary to me, but I suppose it's the 
best option we have...
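(For anyone finding this in the archives later: the bare-bones version is just one data directory and one port per instance, driven by initdb and pg_ctl; Debian's pg_createcluster/pg_ctlcluster wrappers are one packaged way to manage the same thing. A minimal sketch, with made-up paths and port numbering, printed as a dry run:)

```shell
#!/bin/sh
# Minimal sketch: one postgres instance per dev environment on a single
# host, each with its own data directory and port. BASE_DIR and the
# port-numbering convention are assumptions for illustration only.
BASE_DIR=/srv/pg
BASE_PORT=5432

for i in 1 2 3; do
  port=$((BASE_PORT + i))
  dir=$BASE_DIR/env$i
  # Printed rather than executed; drop the echos to actually run them.
  echo "initdb -D $dir"
  echo "pg_ctl -D $dir -o '-p $port' -l $dir/pg.log start"
done
```

(Keeping each instance's shared_buffers small and leaving most RAM to the OS page cache is what lets the idle instances stay cheap, since the page cache is shared across all of them.)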

By "A postgres instance should happily run in half a gig" I assume 
you're not counting cache/buffers, etc.? Because when trying to run 
automation (i.e. automated testing) on our application, we can't get 
query execution time down to an acceptable level with anything less than 
~48GB memory available in the host.

Thanks for the suggestion though. I was considering this as an 
alternative, but every experience I've had of running multiple (let's 
say, more than 2 or 3) instances of the same application/daemon on the 
same host has inevitably led to all sorts of strange issues and 
problems... I suppose it sounds like it's going to be the only option 
though...

>
> Francisco Olarte.


-- 

Jason Antman | Systems Engineer | CMGdigital
jason.antman@xxxxxxxxxx | p: 678-645-4155


-- 
Sent via pgsql-general mailing list (pgsql-general@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general




