On Fri, Jun 13, 2008 at 11:11 PM, Alvaro Herrera
<alvherre@xxxxxxxxxxxxxxxxx> wrote:
> Tom Lane wrote:
>> "James B. Byrne" <byrnejb@xxxxxxxxxxxxx> writes:
>
>> > GiT works by compressing deltas of the contents of successive versions of file
>> > systems under repository control. It treats binary objects as just another
>> > object under control. The question is, are successive (compressed) dumps of
>> > an altered database sufficiently similar to make the deltas small enough to
>> > warrant this approach?
>>
>> No. If you compress it, you can be pretty certain that the output will
>> be different from the first point of difference to the end of the file.
>> You'd have to work on uncompressed output, which might cost more than
>> you'd end up saving ...
>
> The other problem is that since the tables are not dumped in any
> consistent order, it's pretty unlikely that you'd get any similarity
> between two dumps of the same table. To get any benefit, you'd need to
> get pg_dump to dump sorted tuples.
>
> --
> Alvaro Herrera http://www.CommandPrompt.com/
> The PostgreSQL Company - Command Prompt, Inc.
>
> --
> Sent via pgsql-general mailing list (pgsql-general@xxxxxxxxxxxxxx)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-general

The idea of using Git for backing up databases is not that bad. I would
propose the following:

-- dump the creation script into a separate file (or maybe one file per
   object: table, view, function, etc.);
-- dump the content of each table into its own file;
-- dump the tuples sorted, but in plain text (as COPY data or INSERTs
   maybe), as Alvaro suggested;
-- don't use compression (as Tom and Chander suggested), because Git
   already compresses its packed files.

One advantage of using Git in this manner is change tracking: with just a
simple git diff you could see the modifications (inserts, updates,
deletes, schema alterations).

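To illustrate why the sorted, uncompressed, one-file-per-table layout matters for diffs, here is a minimal sketch in Python. It is only an illustration under assumptions: the function names normalize_dump and dump_diff and the sample rows are made up, and the rows are sorted as plain text lines here, whereas a real setup would more likely sort in the database (for example with COPY (SELECT ... ORDER BY ...) TO STDOUT). It relies on each row of a COPY text dump occupying exactly one line.

```python
import difflib


def normalize_dump(copy_text: str) -> list[str]:
    """Return the rows of a plain-text COPY dump in a stable, sorted order.

    Assumption: one row per line, as in COPY text format.
    """
    rows = [line for line in copy_text.splitlines() if line]
    return sorted(rows)


def dump_diff(old_dump: str, new_dump: str) -> list[str]:
    """Return only the added ('+') and removed ('-') rows between two dumps."""
    diff = list(
        difflib.unified_diff(
            normalize_dump(old_dump),
            normalize_dump(new_dump),
            fromfile="old",
            tofile="new",
            lineterm="",
            n=0,  # no context lines, only the changed rows
        )
    )
    # Skip the two file-header lines and the "@@ ... @@" hunk headers.
    return [line for line in diff[2:] if not line.startswith("@@")]


if __name__ == "__main__":
    # Hypothetical table contents; the physical row order differs between dumps.
    old = "2\tbob\n1\talice\n3\tcarol\n"
    new = "3\tcarol\n1\talice\n2\trobert\n4\tdave\n"
    for line in dump_diff(old, new):
        print(line)
```

With the rows sorted, the diff contains only the updated row (shown as one removal plus one addition) and the inserted row. Two dumps holding the same rows in a different physical order produce an empty diff, which is the property Git needs to store small deltas.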
Going a step further, you could also do merges between multiple databases
with the same structure (each database would have its own branch). Just
imagine how simple a database schema upgrade would be in most situations,
when both the development and the deployed schema have been modified and
we want to bring them back into sync.

In conclusion, I would support such an idea.

Ciprian Craciun.