On 12/11/17 19:15, Karsten Hilbert wrote:
On Sat, Nov 11, 2017 at 01:03:18PM +0000, Mark Morgan Lloyd wrote:
Several legacy programs written in Delphi ground to a halt this morning,
which turned out to be because a Debian system had updated its copy of
PostgreSQL and restarted the server, which broke any live connections.
At least some versions of Delphi, not to mention other IDE/RAD tools with
database-aware components, don't automatically try to reestablish a database
session that's been interrupted. In any event, an unexpected server restart
(irrespective of all investment in UPSes etc.) has the potential of playing
havoc on a clustered system.
Is there any way that either the package maintainer or a site
administrator/programmer such as myself can mark the Postgres server
packages as "manual upgrade only" or similar? Or since I'm almost certainly
not the first person to be bitten by this, is there a preferred hack in
mitigation?
Apart from that (putting packages on hold), PostgreSQL
updates on Debian don't upgrade existing clusters
automatically. They do create a new cluster but the old one
is kept around and stays running, IIRC even on the very same
port.
(Having gone all the way from PG 7.1 to PG 10 on Debian :)
With the caveat that Debian has only comparatively-recently introduced
unattended updates as the default... I think only with Stretch. If
you're still on Jessie you can yet be saved :-)
What did
pg_lsclusters
say ?
I don't have it from the time of the problem, but currently it gives me
Ver Cluster Port Status Owner Data directory Log file
9.6 main 5432 online postgres /var/lib/postgresql/9.6/main
/var/log/postgresql/postgresql-9.6-main.log
i.e. a single-server system, although I've since done a manual restart
so that I could change some DIMMs.
However syslog and postgresql-9.6-main.log show me this:
Nov 11 06:27:38 postgres1 systemd[1]: Starting Daily apt upgrade and
clean activities...
Nov 11 06:28:05 postgres1 systemd[1]: Reloading.
Nov 11 06:28:07 postgres1 systemd[1]: Reloading.
Nov 11 06:28:07 postgres1 systemd[1]: Stopped PostgreSQL RDBMS.
Nov 11 06:28:07 postgres1 systemd[1]: Stopping PostgreSQL Cluster
9.6-main...
Nov 11 06:28:08 postgres1 systemd[1]: Stopped PostgreSQL Cluster 9.6-main.
Nov 11 06:28:10 postgres1 systemd[1]: Reloading.
2017-11-11 06:28:07.587 UTC [675] LOG: received fast shutdown request
2017-11-11 06:28:07.587 UTC [675] LOG: aborting any active transactions
[Session names here]
2017-11-11 06:28:07.607 UTC [730] LOG: autovacuum launcher shutting down
[More session names here]
2017-11-11 06:28:07.680 UTC [727] LOG: shutting down
2017-11-11 06:28:07.984 UTC [675] LOG: database system is shut down
2017-11-11 06:28:13.039 UTC [11122] LOG: database system was shut down
at 2017-11-11 06:28:07 UTC
2017-11-11 06:28:13.081 UTC [11122] LOG: MultiXact member wraparound
protections are now enabled
2017-11-11 06:28:13.085 UTC [11126] LOG: autovacuum launcher started
2017-11-11 06:28:13.085 UTC [11121] LOG: database system is ready to
accept connections
2017-11-11 06:28:13.371 UTC [11128] [unknown]@[unknown] LOG: incomplete
startup packet
All live applications saw that as a loss of database connectivity, yet
when I was alerted by their squeals of anguish (MIDI on app servers has
its uses :-) I found the database server running and accepting connections.
There must have been something additional at play.
The apps are written in Delphi, I admit not a very recent version and
they're due to be converted to Lazarus which is an open-source and
portable clone. I'll defend my choice of language since it is,
basically, the best "4GL" you'll find.
However one flaw of Delphi etc. is that they assume that they can safely
hold a database session open for an extended period. I can't speak for
Delphi any more since it has, basically, priced itself out of our league
particularly taking into account its lack of portability, but
FPC/Lazarus appears to have something which is intended to reconnect a
lost session, although it's so far unimplemented.
So I've got multiple options for fixing this at the application level:
either fill in the unimplemented bit of the database control in the
Lazarus Class Library, or prevent apps from holding database connections
open. But the real problem, I feel, is that Debian is enabling
unattended upgrades without checking with the user, and while an
attended upgrade normally asks for confirmation before restarting a
daemon an unattended one doesn't.
--
Mark Morgan Lloyd
markMLl .AT. telemetry.co .DOT. uk
[Opinions above are the author's, not those of his employers or colleagues]
--
Sent via pgsql-general mailing list (pgsql-general@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general