On 12/12/18 3:19 PM, Mike Lissner wrote:
This sounds *very* plausible. So I think there are a few takeaways:
1. Should the docs mention that additive changes with NOT NULL
constraints are bad?
It's not the NOT NULL, it's the lack of a DEFAULT. In general, a column
with a NOT NULL and no DEFAULT is going to bite you sooner or later :)
At this point I have gathered enough of those bite marks to just make it
my policy to always provide a DEFAULT for a NOT NULL column.
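
To make the point concrete, here is a minimal sketch of what happens when a
row arrives without a value for a NOT NULL column. It uses Python's built-in
sqlite3 module as a stand-in so it runs anywhere; that is an assumption for
illustration only, not PostgreSQL or logical replication itself, and the
table name `item`, column name `note`, and helper name are hypothetical.

```python
import sqlite3


def insert_without_new_column(column_ddl):
    """Create a table that has an extra column, then insert a row that omits it.

    sqlite3 is only a stand-in here (an assumption for illustration); the
    table and column names are hypothetical.
    """
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute(f"CREATE TABLE item (id INTEGER PRIMARY KEY, {column_ddl})")
        try:
            # The incoming row knows nothing about the extra column.
            conn.execute("INSERT INTO item (id) VALUES (1)")
        except sqlite3.IntegrityError as exc:
            return ("rejected", str(exc))
        row = conn.execute("SELECT id, note FROM item").fetchone()
        return ("accepted", row)
    finally:
        conn.close()


# NOT NULL with no DEFAULT: nothing can fill the column, so the insert fails.
print(insert_without_new_column("note TEXT NOT NULL"))
# NOT NULL with a DEFAULT: the default fills the column and the insert succeeds.
print(insert_without_new_column("note TEXT NOT NULL DEFAULT ''"))
```

The same reasoning should carry over to a subscriber table with an extra
column: per the PostgreSQL docs, such columns are filled with the default
from the target table's definition, so with no DEFAULT a NOT NULL column has
nothing to be filled with.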
2. Is there a way this could work without completely breaking
replication? For example, should PostgreSQL realize replication can't
work in this instance and then stop it until schemas are back in sync,
like it does with other incompatible schema changes? That'd be better
than failing in this way and is what I'd expect to happen.
Not sure as there is no requirement that a column has a specified
DEFAULT. This is unlike PK and FK constraint violations where the
relationship is spelled out. Trying to parse all the possible ways a
user could get into trouble would require something on the order of an
AI and I don't see that happening anytime soon.
3. Are there other edge cases like this that aren't well documented that
we can expect to creep up on us? If so, should we try to spell out
exactly *which* additive changes *are* OK?
Not that I know of. By their nature edge cases are rare and often are
dealt with in the moment and not pushed out to everybody. The only
solution I know of is pretesting your schema change/replication setup on
a dev installation.
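
As one way to pretest, the sketch below compares column metadata from the
publisher and subscriber and flags subscriber-only columns that are NOT NULL
with no DEFAULT. It assumes the rows come from the standard
information_schema.columns view, fetched separately on each side; the
function name and the sample data are hypothetical, and it is a rough check
only (for instance, it may not account for identity columns).

```python
# Query one could run on each side; information_schema.columns is standard,
# the schema and table names are supplied by the caller.
COLUMNS_QUERY = """
SELECT column_name, is_nullable, column_default
FROM information_schema.columns
WHERE table_schema = %s AND table_name = %s
"""


def risky_subscriber_columns(publisher_rows, subscriber_rows):
    """Return subscriber-only columns that are NOT NULL and have no DEFAULT.

    Each row is (column_name, is_nullable, column_default), where is_nullable
    is 'YES' or 'NO' and column_default is None when no default is set.
    """
    published = {name for name, _, _ in publisher_rows}
    return sorted(
        name
        for name, is_nullable, default in subscriber_rows
        if name not in published and is_nullable == "NO" and default is None
    )


# Hypothetical metadata for one table on each side.
publisher = [("id", "NO", None), ("title", "YES", None)]
subscriber = [
    ("id", "NO", None),
    ("title", "YES", None),
    ("source", "NO", None),         # extra, NOT NULL, no DEFAULT: risky
    ("status", "NO", "'new'::text"),  # extra, NOT NULL, has DEFAULT: fine
    ("notes", "YES", None),         # extra, nullable: fine
]

print(risky_subscriber_columns(publisher, subscriber))
```

Running a check like this on a dev installation before applying a migration
would surface the column to fix, rather than finding out when replication
stops.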
This feels like a major "gotcha" to me, and I'm trying to avoid those. I
feel like the docs are pretty lacking here and that others will find
themselves in similarly bad positions.
Logical replication in core (not the pglogical extension) appeared for
the first time in version 10. On the crawl/walk/run spectrum it is
moving from crawl to walk. The docs will take some time to be more
complete. Just for the record, my previous post was sketching out a
possible scenario, not an ironclad answer. If you think the answer is
plausible and a 'gotcha', I would file a bug:
https://www.postgresql.org/account/login/?next=/account/submitbug/
Better schema migration docs would surely help, too.
Mike
--
Adrian Klaver
adrian.klaver@xxxxxxxxxxx