Konrad Witaszczyk wrote:

> Hi,
>
> I'm considering changing my database schema upgrade routines to recreate
> functions and all objects depending on them (e.g. triggers, views) just after
> the database server is started. It would make the routines much simpler since
> they wouldn't have to depend on the history of the schema.
>
> Does anyone have any experience with recreating all functions and triggers to
> upgrade a database schema, assuming that during the upgrade there are no client
> connections to the database?
>
> Does anyone see any problems with such an approach in terms of consistency and
> performance? I'm not familiar with PostgreSQL internals and I'm not sure how it
> would affect the planner when using the various function volatility categories.
>
> Konrad

Hi,

I think the usual way to do this is to have a table containing a schema
version number that can be consulted to work out where the schema is up to,
and hence which schema migration scripts still need to be run to bring the
database up to date. This is useful for upgrading a system on multiple
database servers. All the cool kids would be doing this. However, I think
this is more common with ORMs, where all the code lives outside the database.

I often have to load old database backups that need to be brought up to date
so that the current software will work with them. I have date/time stamps in
the filenames of the database backups, and my script to load a backup
consults a directory full of schema migration scripts that are also
date/time stamped. It then applies the schema migration scripts that are
dated after the time that the backup was made.

After applying all the schema migration scripts, it runs another script to
audit the state of the database functions with respect to the "source". It
then reloads any functions that differ from the current source. I don't use
views or triggers, but the same idea would apply.

My main advice is not to call any functions (or rely on any views or
triggers) inside schema migration scripts, because you never know whether
they'll exist when they're needed. Also, if any migration scripts massage
data rather than just change the schema, don't assume that the data exists
just because it existed when you wrote the migration script. The script
might run on a very old backup taken before the data existed. At the least,
write migrations that avoid producing errors if the data isn't present, and
minimise the output so that errors don't go unnoticed.

I'd also recommend writing a tool that compares the functions, triggers and
views in the database against the code for them on disk somewhere, and
reloads the ones that differ from source. Once you have such an audit tool,
it's very easy to see what needs to be reloaded, and the tool can do that
too.

A very important part of such an audit tool is that it (probably?) needs to
perform a topological sort on the functions, triggers and views so as to
load them in a sensible order. Inspect the source for them all, looking for
references between them, and do a topological sort so that you load the
things that are depended on before the things that depend on them. This
might not actually be necessary, but I think it is (or I wouldn't have
bothered doing it, unless it's cargo cult behaviour on my part from Sybase
days). :-)

I use the function auditing tool to load new or changed functions on
multiple servers all the time, but I only apply old schema migration scripts
when loading backups.
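In case a concrete shape helps, here's a minimal sketch of the
version-table approach mentioned at the top, in Python with psycopg2. The
schema_version table, the migrations/ directory and the
NNNN-description.sql naming are all made up for illustration; use whatever
conventions suit you.

    #!/usr/bin/env python3
    # Minimal sketch of the version-table approach (illustrative only).
    # Assumes a migrations/ directory of files named NNNN-description.sql
    # and a single-row schema_version table recording the current version.

    import os
    import re
    import psycopg2

    MIGRATIONS_DIR = 'migrations'  # hypothetical layout

    def current_version(cur):
        # Create the version table on first use so very old backups work too.
        cur.execute("create table if not exists schema_version"
                    " (version integer not null)")
        cur.execute("select version from schema_version")
        row = cur.fetchone()
        if row is None:
            cur.execute("insert into schema_version values (0)")
            return 0
        return row[0]

    def pending_migrations(version):
        # Migration scripts numbered above the current version, in order.
        scripts = []
        for name in os.listdir(MIGRATIONS_DIR):
            m = re.match(r'(\d+)-.*\.sql$', name)
            if m and int(m.group(1)) > version:
                scripts.append((int(m.group(1)), name))
        return sorted(scripts)

    def migrate(conn):
        cur = conn.cursor()
        version = current_version(cur)
        for number, name in pending_migrations(version):
            with open(os.path.join(MIGRATIONS_DIR, name)) as f:
                cur.execute(f.read())
            cur.execute("update schema_version set version = %s", (number,))
            conn.commit()  # commit each migration separately

    if __name__ == '__main__':
        migrate(psycopg2.connect(dbname='mydb'))

Each migration runs and commits on its own, so a failure leaves the version
table pointing at the last script that succeeded.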
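And here's roughly the shape of the audit/reload tool described above,
again only a sketch. It assumes one CREATE OR REPLACE FUNCTION per file in
a functions/ directory (my invention), detects dependencies crudely by
searching each function's source for the names of the others, and uses
Python's graphlib for the topological sort:

    #!/usr/bin/env python3
    # Sketch of an audit tool: compare function source on disk against the
    # database, and reload the ones that differ, dependencies first.

    import os
    import re
    import psycopg2
    from graphlib import TopologicalSorter  # Python 3.9+

    FUNCTIONS_DIR = 'functions'  # hypothetical: one function per name.sql

    def disk_functions():
        # Map function name -> source text, taking the name from the filename.
        funcs = {}
        for name in os.listdir(FUNCTIONS_DIR):
            if name.endswith('.sql'):
                with open(os.path.join(FUNCTIONS_DIR, name)) as f:
                    funcs[name[:-4]] = f.read()
        return funcs

    def db_functions(cur):
        # Current definitions from the catalog (ignores overloading).
        cur.execute("""
            select p.proname, pg_get_functiondef(p.oid)
            from pg_proc p join pg_namespace n on n.oid = p.pronamespace
            where n.nspname = 'public'
        """)
        return dict(cur.fetchall())

    def load_order(funcs):
        # Crude dependency detection: a function depends on any other
        # function whose name appears in its source. Mutual references
        # would raise a CycleError here.
        graph = {name: {other for other in funcs
                        if other != name
                        and re.search(r'\b%s\b' % re.escape(other), src)}
                 for name, src in funcs.items()}
        return TopologicalSorter(graph).static_order()

    def audit(conn, fix=False):
        cur = conn.cursor()
        on_disk = disk_functions()
        in_db = db_functions(cur)
        for name in load_order(on_disk):
            if on_disk[name].strip() != in_db.get(name, '').strip():
                print('differs:', name)
                if fix:
                    cur.execute(on_disk[name])
        conn.commit()

    if __name__ == '__main__':
        audit(psycopg2.connect(dbname='mydb'), fix=True)

One caveat: pg_get_functiondef() reformats the definition, so comparing its
output against the raw file will report false differences unless you
normalise both sides (or record a hash of what was last loaded).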
One last thing: if you need to keep very old backups indefinitely for legal
reasons, make a point of reloading them every year to apply the last year's
worth of schema changes, and make a backup of that updated version. It makes
it less likely that bit rot will ruin your day in the distant future. But
also keep the original backup, because sometimes you need the backup as it
was, without any changes made after the fact. The ancient version won't work
with current application code, but that can be the problem of whoever has
demanded the data in its original form.

I've never encountered any consistency or performance problems in doing
something like this, but I only have small databases, so that doesn't mean
there won't be any.

Good luck!

cheers,
raf