I've got an interesting issue. We're running postgres 9.1.1 linux x64 centos 5.8
aspdata=# select version();
version
---------------------------------------------------------------------------------------------------------------
PostgreSQL 9.1.1 on x86_64-unknown-linux-gnu, compiled by gcc (GCC) 4.1.2 20080704 (Red Hat 4.1.2-52), 64-bit
We currently have 1 DB we use for multiple independent tenant schemas. The database size is current 56227005240 bytes as reported by pg_database_size.
There are 557 schemas each with about 1300 objects (760 tables 520 views).
We are using pg_dump to do backups of a single schema with a total size of (5480448 bytes calculated with SELECT sum(pg_relation_size(schemaname || '.' || tablename))::bigint FROM pg_tables WHERE schemaname ='miketest';)
pg_dump -f /dumps/test.backup -Fc -n miketest aspdata
This dump is currently taking around 8 minutes. While dumping the pg_dump process is using 100% of one core in the server (24 core machine). Doing a -v pg_dump I found that the following stages are taking the majority of the time
reading user_defined tables (2 minutes and 20 seconds)
reading dependency data (5 minutes and 30 seconds)
The size of the schema doesn't really seem to effect theses times are almost identical for a 700 meg schema as well (obviously the data dump portion takes longer with the bigger db)
During the reading user_defined tables the following query shows up for a 10-20 seconds then the pg_dump connection sits idle for the rest of the 2 minutes:
SELECT c.tableoid, c.oid, c.relname, c.relacl, c.relkind, c.relnamespace, (SELECT rolname FROM pg_catalog.pg_roles WHERE oid = c.relowner) AS rolname, c.relchecks, c.relhastriggers, c.relhasindex, c.relhasrules, c.relhasoids, c.relfrozenxid, tc.oid AS
toid, tc.relfrozenxid AS tfrozenxid, c.relpersistence, CASE WHEN c.reloftype <> 0 THEN c.reloftype::pg_catalog.regtype ELSE NULL END AS reloftype, d.refobjid AS owning_tab, d.refobjsubid AS owning_col, (SELECT spcname FROM pg_tablespace
t WHERE t.oid = c.reltablespace) AS reltablespace, array_to_string(c.reloptions, ', ') AS reloptions, array_to_string(array(SELECT 'toast.' || x FROM unnest(tc.reloptions) x), ', ') AS toast_reloptions FROM pg_class c LEFT JOIN pg_depend
d ON (c.relkind = 'S' AND d.classid = "" AND d.objid = c.oid AND d.objsubid = 0 AND d.refclassid = c.tableoid AND d.deptype = 'a') LEFT JOIN pg_class tc ON (c.reltoastrelid = tc.oid) WHERE c.relkind in ('r', 'S', 'v', 'c', 'f')
ORDER BY c.oid
During the reading dependency data the following queries show up for a few seconds then the connection sits idle for the rest of the 5.5 minutes:
SELECT tableoid
, oid, typname, typnamespace, (SELECT rolname FROM pg_catalog.pg_roles WHERE oid = typowner) AS rolname, typinput::oid AS typinput, typoutput::oid AS typoutput, typelem, typrelid, CASE WHEN typrelid = 0 THEN ' '::"char" ELSE (SELECT relk
ind FROM pg_class WHERE oid = typrelid) END AS typrelkind, typtype, typisdefined, typname[0] = '_' AND typelem != 0 AND (SELECT typarray FROM pg_type te WHERE oid = pg_type.typelem) = oid AS isarray FROM pg_type
SELECT classid,
objid, refclassid, refobjid, deptype FROM pg_depend WHERE deptype != 'p' AND deptype != 'e' ORDER BY 1,2
Non production servers with less schemas don't seem to have any issue and perform the same dump in under 10 seconds on much lower classed hardware.
Server Specs:
2 x Intel Xeon X5650
32 Gigs of Ram
DELL Perc H700 Controller
Data drive - 6XSAS2 15K in RAID10 FS: xfs
Log Drive - 2XSAS2 15K in RAID1 FS: xfs
There are 2 of these machine one master other slaved via streaming replication over gigabit network.
Thanks