On Sun, Jun 10, 2012 at 4:47 PM, Jeff Janes <jeff.janes@xxxxxxxxx> wrote:
> On Wed, May 30, 2012 at 2:06 AM, Tatsuo Ishii <ishii@xxxxxxxxxxxxxx> wrote:
>>> Yeah, Jeff's experiments indicated that the remaining bottleneck is
>>> lock management in the server.  What I fixed so far on the pg_dump
>>> side should be enough to let partial dumps run at reasonable speed
>>> even if the whole database contains many tables.  But if psql is
>>> taking AccessShareLock on lots of tables, there's still a problem.
>>
>> OK, I modified the part of pg_dump where a tremendous number of LOCK
>> TABLE statements are issued, replacing them with a single LOCK TABLE
>> naming multiple tables.  With 100k tables, the LOCK statements used
>> to take 13 minutes in total; now they take only 3 seconds.  Comments?
>
> Could you rebase this?  I tried doing it myself, but must have messed
> it up because it got slower rather than faster.

OK, I found the problem.  In fixing a merge conflict, I had it execute
the query every time it appended a table, rather than just once at the
end.

With my proposed patch in place, I find that for a full default dump
your patch is slightly faster with < 300,000 tables, and slightly
slower with > 300,000.  The differences are generally < 2% in either
direction.  When it comes to back-patching and partial dumps, I'm not
really sure what to test.

For the record, there is still quadratic behavior on the server, in
get_tabstat_entry, albeit with a much smaller constant factor than the
Reassign one.  I don't know if it is worth working on that in
isolation--if PG is going to try to accommodate hundreds of thousands
of tables, there probably needs to be a more general way to limit the
memory used by all aspects of the relation caches.

Cheers,

Jeff
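
A minimal sketch of the batched locking described above, using libpq's
PQExpBuffer string builder.  The helper name and the assumption that
table names arrive already schema-qualified and quoted are illustrative,
not the actual patch, and error handling is omitted.  The comment inside
the loop marks where the merge-conflict bug (executing the statement
once per appended table) crept in:

    #include "libpq-fe.h"
    #include "pqexpbuffer.h"

    /* Build one LOCK TABLE statement naming every table, then execute
     * it a single time.  pg_dump takes ACCESS SHARE MODE locks. */
    static void
    lock_all_tables(PGconn *conn, char **qualified_names, int ntables)
    {
        PQExpBuffer query = createPQExpBuffer();

        appendPQExpBufferStr(query, "LOCK TABLE ");
        for (int i = 0; i < ntables; i++)
        {
            if (i > 0)
                appendPQExpBufferStr(query, ", ");
            appendPQExpBufferStr(query, qualified_names[i]);
            /* The rebase bug: calling PQexec(conn, query->data) here,
             * inside the loop, re-sends an ever-longer statement once
             * per table instead of once in total. */
        }
        appendPQExpBufferStr(query, " IN ACCESS SHARE MODE");

        PGresult *res = PQexec(conn, query->data);  /* execute once */
        PQclear(res);
        destroyPQExpBuffer(query);
    }

Batching this way replaces ~100k separate statement round trips with
one, which is consistent with the 13-minutes-to-3-seconds improvement
Tatsuo reports.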
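
On the remaining server-side hotspot, here is a schematic of the
scan-before-append pattern that makes get_tabstat_entry quadratic when
one backend touches very many tables.  This is a simplified
illustration of the pattern, not the backend's actual data structure:

    #define MAX_TABLES 300000

    typedef struct TabStatEntry
    {
        unsigned int t_id;          /* table OID */
        /* ... access counters would live here ... */
    } TabStatEntry;

    static TabStatEntry entries[MAX_TABLES];
    static int n_used = 0;

    /* Find the stats entry for rel_id, creating one if needed. */
    static TabStatEntry *
    get_entry(unsigned int rel_id)
    {
        /* Linear scan of every entry created so far ... */
        for (int i = 0; i < n_used; i++)
            if (entries[i].t_id == rel_id)
                return &entries[i];

        if (n_used >= MAX_TABLES)
            return NULL;            /* full (the real code grows a list) */

        /* ... so N distinct tables cost ~N^2/2 comparisons overall. */
        entries[n_used].t_id = rel_id;
        return &entries[n_used++];
    }

Each lookup costs time proportional to the number of entries already
created, which is why this phase still degrades quadratically even
after the LOCK TABLE batching, just with a smaller constant factor.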