-----BEGIN PGP SIGNED MESSAGE----- Hash: RIPEMD160 François Beausoleil asked: > To that end, we'd like to anonymize the data before it leaves the database server. > > One solution we thought of would be to run statements prior to pg_dump, but within > the same transaction, something like this: > > BEGIN; > UPDATE users SET email = 'dev+' || id || '@example.com', password_hash = '/* hash of "password" */', ...; > -- launch pg_dump as usual, ensuring a ROLLBACK at the end ... > Is there a ready-made solution for this? No - at least not with generating a dump and scrubbing *before* the data comes out. Some other ideas: * Periodically do a full dump to another database under your control, sanitize the data, and make all dev dumps come from *that* database. Process roughly becomes: * pg_dump herokudb | psql dev_temp * <sanitize dev_temp> * drop existing dev_old; rename devdb to dev_old; rename dev_temp to devdb * Devs can pg_dump devdb at will That still moves your sensitive data to another server though, even temporarily. Another approach is to use the -T flag of pg_dump to exclude certain tables. Make modified copies of them on the server, then rename them after the dump (or simply put them in a new namespace): * (create a dev.users identical (including indexes, etc.) to public.users) * truncate table dev.users; * insert into dev.users select * from public.users; * update dev.users set email = 'dev' + || ...etc. <e.g. sanitize data> * pg_dump -d <herokus DATABASE_URL> -T public.users > devs_use_this.pg Then dev could do: set schema = dev, public; Or you could simply move the sanitized table back: alter table dev.users set schema public; If you are going to rename, it may be simpler to not make an identical copy of the affected tables (i.e. with indexes) but just a data-only copy: create table dev.users as select * from public.users; <sanitize dev.users> pg_dump herokudb --schema-only | psql devdb pg_dump herokudb --data-only -T public.users | psql devdb psql devdb -c 'insert into public.users select * from dev.users' - -- Greg Sabino Mullane greg@xxxxxxxxxxxx End Point Corporation http://www.endpoint.com/ PGP Key: 0x14964AC8 201703240911 http://biglumber.com/x/web?pk=2529DF6AB8F79407E94445B4BC9B906714964AC8 -----BEGIN PGP SIGNATURE----- iEYEAREDAAYFAljVHHsACgkQvJuQZxSWSshUbgCg7TzCkAzT4wKoKd5/2rruzLte TJcAoI7AvGdGzlNp5b3N+LFJ9DWIZ8/C =7heB -----END PGP SIGNATURE----- -- Sent via pgsql-general mailing list (pgsql-general@xxxxxxxxxxxxxx) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general