On Thu, May 14, 2020 at 10:22 AM Wim Bertels <wim.bertels@xxxxxxx> wrote:
Keith Fiske wrote on Thu 14-05-2020 at 10:08 [-0400]:
> It doesn't matter how small the dataset change is. The same WAL
> stream is used for both logical and physical replication so it has to
> keep all WAL files until all subscribers for that publication have
> confirmed they have received them. If even a single subscriber goes
> offline, all WAL will be kept until that subscriber reconnects.
That is interesting. I assume this is the WAL for the whole cluster, since
logical decoding is then applied to that WAL for logical replication.
Do you have an order-of-magnitude estimate for the total size of the WAL files?
So far this seems ok over here (with one subscriber inactive for 2
days):
# du -ch pg_logical/ pg_wal/ pg_replslot/
912K pg_logical/snapshots
4,0K pg_logical/mappings
924K pg_logical/
4,0K pg_wal/archive_status
81M pg_wal/
8,0K pg_replslot/db2_sub
8,0K pg_replslot/db2_sub1
8,0K pg_replslot/db2_sub2
28K pg_replslot/
81M totaal
This is after two days with replication set up.
Assuming that students will be offline for at most 2 or 3 days,
this seems ok?
--
Kind regards,
Wim
How much of an impact this has depends entirely on the write rate of your cluster. If you have very few writes you may be fine. But I would definitely suggest getting some monitoring in place if you expect to have subscribers offline for any long period of time.
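As a rough sketch of what such monitoring could compute (an illustration only, not something taken from this setup): the amount of WAL a slot holds back is roughly the distance between the current WAL position and the slot's restart_lsn. The snippet assumes you have already fetched the output of `SELECT pg_current_wal_lsn()` and `SELECT slot_name, active, restart_lsn FROM pg_replication_slots` by whatever means you prefer. The function names, the 32 MiB threshold, and the sample LSN values are all made up for the example.

```python
def lsn_to_int(lsn):
    """Convert a PostgreSQL LSN such as '16/B374D848' to a byte position.

    An LSN is printed as two hexadecimal numbers separated by a slash:
    the high 32 bits and the low 32 bits of a 64-bit WAL position.
    """
    high, low = lsn.split("/")
    return (int(high, 16) << 32) | int(low, 16)


def retained_wal_bytes(current_lsn, restart_lsn):
    """Bytes of WAL between a slot's restart_lsn and the current position."""
    return lsn_to_int(current_lsn) - lsn_to_int(restart_lsn)


def slots_over_limit(current_lsn, slots, limit_bytes):
    """Return (slot_name, active, retained_bytes) for slots above the limit.

    `slots` is an iterable of (slot_name, active, restart_lsn) rows.
    Rows whose restart_lsn is None are skipped.
    """
    flagged = []
    for name, active, restart_lsn in slots:
        if restart_lsn is None:
            continue
        retained = retained_wal_bytes(current_lsn, restart_lsn)
        if retained > limit_bytes:
            flagged.append((name, active, retained))
    return flagged


if __name__ == "__main__":
    # Hypothetical values; the slot names are the ones from the du output.
    rows = [
        ("db2_sub", True, "0/4FFFF00"),
        ("db2_sub1", False, "0/1000000"),
        ("db2_sub2", True, None),
    ]
    for name, active, retained in slots_over_limit("0/5000000", rows, 32 * 1024 * 1024):
        print(f"slot {name} (active={active}) is holding back {retained} bytes of WAL")
```

Run from cron or a monitoring agent, something like this would flag a slot well before the disk fills up. The right threshold depends on your write rate and on how much free space pg_wal has.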
But, again, I would still try to rethink this strategy. Offline subscribers can be a very big problem if they never come back. Not only would you eventually fill up your disk, but in the meantime PG can no longer recycle its WAL files, and when that subscriber finally does come back it can trigger excessive cleanup operations, which can have a big IO impact.