Russ Brown wrote:
> On 6/9/05, elein <elein@xxxxxxxxxxx> wrote:
>
>> On Thu, Jun 09, 2005 at 04:16:46PM -0500, John Browne wrote:
>>
>>> How would you handle the migration of the data with these user
>>> scripts? Dump it to a temp table?
>>
>> If your scripts are correct, you should be able to load your base
>> scripts, apply each change script in order, and have the result be
>> the exact same database schema.
>>
>> If they are not, checkpoint with a schema dump and start again with
>> the change scripts. Of course, getting the scripts wrong defeats the
>> point of the whole exercise, but it is not easy and requires
>> vigilance.
>
> The big complexity for me is that the database schema's state should
> be stored along with the code that uses it: i.e. in CVS or Subversion
> or whatever, with the code. That way you have a consistent snapshot
> of your complete system database at any given point in time (minus
> the data itself). Developers will need to re-dump the schema whenever
> they make a change to the database and commit it along with
> everything else, but that's easily scriptable.
>
> Writing individual 'patch' scripts is fine for linear development,
> but it breaks down when dealing with a development environment that
> involves branching. If two branches make changes to the database,
> each branch's patch file would be written against the original
> version, which may no longer exist once the other patch has been
> applied. What is needed is a tool which will compare any two
> revisions of the schema and generate a patch file that performs the
> migration.

This is interesting... You'd want to be able to generate either a bunch
of CREATEs to create a schema from scratch, or a 'patch' of ALTER
commands to move between arbitrary revisions or to a working copy
(i.e., a live DB). This implies you need to store an intermediate
(non-SQL) representation in your repository (like the output of WBDiff
mentioned previously).
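For what it's worth, the "load the base scripts and apply each change script in order" workflow is easy to automate. A minimal sketch (the script names, the `schema_version` table, and the use of SQLite as a stand-in for the real database are all illustrative assumptions, not anything from this thread):

```python
import sqlite3

# Hypothetical ordered change scripts; in practice these would live in
# version control as files like 001_base.sql, 002_add_email.sql, ...
CHANGE_SCRIPTS = [
    # 001: the base schema
    "CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT);",
    # 002: a later change script
    "ALTER TABLE users ADD COLUMN email TEXT;",
]

def migrate(conn, scripts):
    """Apply each change script in order; record the current version in
    a schema_version table so a partially migrated DB can catch up."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS schema_version (version INTEGER)")
    row = conn.execute("SELECT MAX(version) FROM schema_version").fetchone()
    current = row[0] or 0
    for version, sql in enumerate(scripts, start=1):
        if version > current:
            conn.executescript(sql)
            conn.execute("INSERT INTO schema_version VALUES (?)", (version,))
    conn.commit()

conn = sqlite3.connect(":memory:")
migrate(conn, CHANGE_SCRIPTS)
cols = [row[1] for row in conn.execute("PRAGMA table_info(users)")]
print(cols)  # ['id', 'name', 'email']
```

Because the applied version is recorded, running the tool again is a no-op until new change scripts appear, which is what makes the "apply in order and get the exact same schema" guarantee checkable.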
What's unusual is that your SQL-generating tool would need to check out
*both* versions of the representation to generate the patch. I can't
think of any other problem that has this requirement - usually, you're
relying on your version control software to generate such diffs for
you.

> This would obviously have to be pretty damn clever. Among the
> difficulties would be ensuring that the patch applies changes in the
> correct order (e.g. add a column before adding a foreign key). It's
> hard, but I don't believe it's impossible.

As an uninformed, off-the-wall idea, could one compare snapshots of the
system tables to generate these diffs? I know next to nothing about
these, but it seems like they'd contain the info you'd need.

This still doesn't solve the problem of identifying which tables should
have their *data* included. For that, I suppose one could create a
table for storing such metadata, or abuse COMMENT. It'd also be nice to
be able to exclude specified objects entirely - I'm thinking of 3rd
party modules (tsearch2, say) that one would want to install by hand
(to get the latest version or whatever).

I recently got a script that does something similar for MySQL by
parsing the output of 'show create table'. It's pretty simple and not
full-featured. I need to check with the author before posting it,
though.

If anyone is inclined to work on this, I might be able to route a few
dollars their way... please contact me off list.

-- 
Peter Fein                 pfein@xxxxxxxxx                 773-575-0694

Basically, if you're not a utopianist, you're a schmuck. -J. Feldman

---------------------------(end of broadcast)---------------------------
TIP 9: the planner will ignore your desire to choose an index scan if
your joining column's datatypes do not match
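The "compare snapshots of the system tables" idea is roughly: read the catalog into a neutral structure, diff two such structures, and emit DDL. A sketch under stated assumptions - it uses SQLite's catalog (`sqlite_master` and `PRAGMA table_info`) as a stand-in for pg_catalog/information_schema so it runs standalone, and it only handles tables and added columns, not the dependency ordering (columns before foreign keys) the thread points out a real tool would need:

```python
import sqlite3

def snapshot(conn):
    """Read the 'system tables' (SQLite's catalog here, standing in for
    pg_catalog/information_schema) into a {table: {column: type}} dict."""
    schema = {}
    tables = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table'").fetchall()
    for (table,) in tables:
        cols = conn.execute(f"PRAGMA table_info({table})")
        schema[table] = {row[1]: row[2] for row in cols}
    return schema

def diff(old, new):
    """Generate DDL to move a database from the old snapshot to the new.
    A real tool would also order dependent changes, and handle dropped or
    retyped columns, constraints, indexes, views, etc."""
    stmts = []
    for table in new.keys() - old.keys():        # brand-new tables
        coldefs = ", ".join(f"{c} {t}" for c, t in new[table].items())
        stmts.append(f"CREATE TABLE {table} ({coldefs});")
    for table in old.keys() - new.keys():        # removed tables
        stmts.append(f"DROP TABLE {table};")
    for table in new.keys() & old.keys():        # added columns
        for col in new[table].keys() - old[table].keys():
            stmts.append(
                f"ALTER TABLE {table} ADD COLUMN {col} {new[table][col]};")
    return stmts

# Two "revisions" of a schema, each a live database of its own.
a = sqlite3.connect(":memory:")
a.execute("CREATE TABLE users (id INTEGER)")
b = sqlite3.connect(":memory:")
b.execute("CREATE TABLE users (id INTEGER, email TEXT)")
b.execute("CREATE TABLE orders (id INTEGER)")

patch = diff(snapshot(a), snapshot(b))
for stmt in patch:
    a.execute(stmt)      # apply the generated patch to the old database
assert snapshot(a) == snapshot(b)
```

This also shows why the tool needs *both* versions checked out at once: the diff is computed between two complete snapshots, not replayed from a log of changes.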