Re: Unexpected data when subscribing to logical replication slot

Adrian Klaver <adrian.klaver@xxxxxxxxxxx> · Thu, 9 May 2024 08:07:12 -0700

On 5/9/24 00:32, Daniel McKenzie wrote:
    Asynchronous commit introduces the risk of data loss. There is a short
    time window between the report of transaction completion to the client
    and the time that the transaction is truly committed.

To get anywhere with this issue you will need to provide the information 
Tomas Vondra requested upstream:

"
Where/how does the enrichment query run? How does the whole process look 
like? I guess an application is receiving decoded changes as JSON, and 
then querying the database?
"

and

"
Would be good to have some sort of reproducer - ideally a script that
sets up an instance + replication, and demonstrates the issue. Or at
least a sufficiently detailed steps to reproduce it without having to
guess what exactly you did.
"

The documentation speaks about synchronous_commit changing how 
transactions change behaviour for the client. So in this case, my psql 
terminal is the client, and I would expect a faster commit (from its 
perspective) and then a period of risk (as a process usually done as 
part of the commit is now being done in the background) but it's not 
clear how that affects a replication slot subscriber.

What we're struggling to understand is: why are we seeing any updates in 
the replication slot before they have been "truly committed"?

From:

https://www.postgresql.org/docs/current/wal-async-commit.html

"As described in the previous section, transaction commit is normally 
synchronous: the server waits for the transaction's WAL records to be 
flushed to permanent storage before returning a success indication to 
the client. The client is therefore guaranteed that a transaction 
reported to be committed will be preserved, even in the event of a 
server crash immediately after. However, for short transactions this 
delay is a major component of the total transaction time. Selecting 
asynchronous commit mode means that the server returns success as soon 
as the transaction is logically completed, before the WAL records it 
generated have actually made their way to disk. This can provide a 
significant boost in throughput for small transactions."

It is about the state of the WAL record, in synchronous_commit=on the 
commit will not happen on the server and be seen by the client until the 
WAL record has been recorded as saved to disk. For 
synchronous_commit=off the commit happens as soon as COMMIT is reached 
in the transaction and the WAL record save happens after that.

I don't use wal2json so I don't know how it deals with the above.

There appears to be a state of limbo between updating data and that data 
being available to query (and our subscriber is picking up changes in 
this period of time) but I can't pin down any documentation which 
describes it.
The answer awaits a the full description of the process requested by 
Tomas Vondra.

Best guess, the fact that synchronous_commit=off 'cures' it implies that 
in synchronous_commit=on mode you are picking up data on the receiving 
and sending end at different points in '... the server waits for the 
transaction's WAL records to be flushed to permanent storage before 
returning a success indication to the client.'

We've had this running in live now for years without a hiccup so we are 
surprised to learn that we have this massive race condition and it just 

I would not say "...perhaps 1 in 50 times" is massive.

so happens that the hardware is fast enough to process the transaction 
before the .NET application can react to replication slot changes.

Daniel McKenzie
Software Developer

--
Adrian Klaver
adrian.klaver@xxxxxxxxxxx