On 1/22/19 9:10 AM, Jeremy Finzel wrote:
Thanks, I see... So if I understand it correctly - since I have
quite big partitions, roughly 30 GB each in one parent table and from
~1 GB to ~5 GB in several others, I presume I would have to set
wal_keep_segments to some really high number and stop our security
cron job that cleans up old WAL segments (we already had problems
with an almost full disk because of old WAL segments) until the
whole snapshot transfer is done. Only after the whole snapshot has
been transferred do the logical replication workers start streaming
the WAL that reflects changes made from the moment the snapshot
was taken...
jm
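For what it's worth, on a PostgreSQL 11-era publisher, bumping WAL retention
for the duration of the initial copy would look roughly like the sketch below;
the value of 1024 segments (about 16 GB with the default 16 MB segment size)
is purely illustrative and would need sizing against your actual partitions:

ALTER SYSTEM SET wal_keep_segments = 1024;
SELECT pg_reload_conf();  -- this setting only needs a reload, not a restart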
Understand there are other downsides to keeping around a huge amount
of WAL segments beyond just taking up disk space. None of the data
held in those WAL segments can be vacuumed away while they are
retained, which can lead to significant bloat and performance issues
over time.
That is news to me. Can you provide a citation for this?
I'm not exactly clear on your use case, but if you just need to
resynchronize data for a single table, there is a built-in way to do that
(it would actually be nice if the docs spelled this out).
On publisher:
ALTER PUBLICATION mypub DROP TABLE old_data_table;
On subscriber:
ALTER SUBSCRIPTION mysub REFRESH PUBLICATION WITH (copy_data = true);
On publisher:
ALTER PUBLICATION mypub ADD TABLE old_data_table;
On subscriber:
ALTER SUBSCRIPTION mysub REFRESH PUBLICATION WITH (copy_data = true);
The last command will resync the table from its current data on the
publisher, regardless of the WAL file situation. This is the "normal"
way to resynchronize data between clusters when a long time has
passed, rather than trying to keep all that WAL around!
So far as I can tell from testing, the above pattern is the easiest way to
do this, and it will not resynchronize any of the other tables in your
subscription.
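If you want to confirm when the copy has finished, one way (assuming
PostgreSQL 10 or later) is to watch the per-table sync state on the
subscriber; 'd' means the data copy is still running and 'r' means the
table is back to normal streaming replication:

-- On subscriber: check the sync state of the re-added table
SELECT srrelid::regclass AS table_name, srsubstate
FROM pg_subscription_rel
WHERE srrelid = 'old_data_table'::regclass;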
P.S. do heed the advice of the others and get more familiar with the
docs around WAL archiving.
Thanks,
Jeremy
--
Adrian Klaver
adrian.klaver@xxxxxxxxxxx