On Thu, May 24, 2012 at 08:20:34PM -0700, Jeff Janes wrote:
> On Thu, May 24, 2012 at 8:21 AM, Craig James <cjames@xxxxxxxxxxxxxx> wrote:
> >
> > On Thu, May 24, 2012 at 12:06 AM, Hugo <Nabble> <hugo.tech@xxxxxxxxx> wrote:
> >>
> >> Hi everyone,
> >>
> >> We have a production database (PostgreSQL 9.0) with more than
> >> 20,000 schemas and about 40GB of data. In the past we had all that
> >> information in just one schema, and pg_dump used to work just fine
> >> (2-3 hours to dump everything). Then we decided to split the
> >> database into schemas, which makes a lot of sense for the kind of
> >> information we store and the plans we have for the future. The
> >> problem now is that pg_dump takes forever to finish (more than 24
> >> hours) and we just can't have consistent daily backups like we had
> >> in the past. When I try to dump just one schema with almost nothing
> >> in it, it takes 12 minutes.
>
> Sorry, your original did not show up here, so I'm piggy-backing on
> Craig's reply.
>
> Is dumping just one schema out of thousands an actual use case, or is
> it just an attempt to find a faster way to dump all the schemata
> through a back door?
>
> pg_dump itself seems to have a lot of quadratic portions (plus another
> one on the server which it hits pretty heavily), and it is hard to know
> where to start addressing them. It seems like addressing the overall
> quadratic nature might be a globally better option, but addressing
> just the problem with dumping one schema might be easier to kluge
> together.

Postgres 9.2 will have some speedups for pg_dump scanning large
databases --- that might help.

--
  Bruce Momjian  <bruce@xxxxxxxxxx>        http://momjian.us
  EnterpriseDB                             http://enterprisedb.com

  + It's impossible for everything to be true. +
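
For concreteness, the single-schema case Hugo describes corresponds to
a command like the sketch below (the schema and database names,
customer_0001 and production_db, are made up for illustration; the
options shown all exist in pg_dump 9.0):

    # Dump one schema out of ~20,000. Even a near-empty schema pays
    # the full cost of pg_dump's per-run catalog scan, which is why
    # it can take 12 minutes.
    pg_dump --schema=customer_0001 --format=custom \
            --file=customer_0001.dump production_db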
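
The "back door" Jeff alludes to would look something like the per-schema
loop below, a minimal sketch assuming user schemas can be listed from
pg_namespace. Note that separate pg_dump runs do not share a snapshot,
so the per-schema dumps are not mutually consistent, and each run
repeats the expensive catalog scan: at roughly 12 minutes per schema,
20,000 iterations would be far slower than one full dump, not faster.

    # Hypothetical workaround: dump each user schema to its own file.
    # Filters out pg_* system schemas and information_schema.
    for s in $(psql -At -d production_db -c \
        "SELECT nspname FROM pg_namespace
         WHERE nspname NOT LIKE 'pg\_%'
           AND nspname <> 'information_schema'")
    do
        pg_dump --schema="$s" --format=custom \
                --file="$s.dump" production_db
    done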