Hi,

On 2018-09-14 21:01:54 +0200, Andreas Brandl wrote:
> we're discussing a system design and it boils down to a question
> regarding commit timestamps (in the sense of [1],
> track_commit_timestamp='on'):
>
> We have a insert-only (append-only) table. Do commit timestamps on
> this table constitute the same order in which records become visible
> on a secondary (streaming replication)? Is there any reason why this
> might not be the case?
>
> To put this differently: If a client reads from a secondary and reads
> the "latest" record by commit timestamp (commit time T). Is it safe to
> assume that there won't ever be another record with a lower commit
> timestamp < T, that only shows up on the secondary after that read?

No, that's not safe to assume. The order of visibility on the standby
is determined by the point at which the commit's WAL record is inserted
into the log. The commit timestamp has to be determined *before* that
point, because it is part of the WAL record:

    SetCurrentTransactionStopTimestamp();

    XactLogCommitRecord(xactStopTimestamp,
                        nchildren, children, nrels, rels,
                        nmsgs, invalMessages,
                        RelcacheInitFileInval, forceSyncCommit,
                        MyXactFlags,
                        InvalidTransactionId, NULL /* plain commit */ );

Those two things happen sequentially, *WITHOUT* a lock preventing
concurrent transactions from doing the same. So one transaction can
pick the earlier timestamp and still insert its commit record after a
concurrent transaction that picked a later one.

> I'm aware of concerns regarding physical time, time adjustments and so
> on, so the question here assumes those things never happen.

Note that in addition to that you cannot be certain that concurrent
transactions *in the same postgres cluster* have perfectly coherent
timestamps - clock drift between CPU sockets does still sometimes occur,
and used to be extremely common.

Greetings,

Andres Freund
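
To make the interleaving described above concrete, here is a minimal toy
sketch. It is not PostgreSQL code: CommitRecord, Wal,
take_commit_timestamp(), wal_insert() and the timestamp values 100 and
101 are all made up for illustration. The only thing it models is that
picking the timestamp and appending the commit record are two separate
steps, with nothing stopping another transaction from getting in between.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, heavily simplified stand-in for a commit WAL record. */
typedef struct {
    int     xid;        /* transaction id */
    int64_t commit_ts;  /* timestamp chosen BEFORE the WAL insertion */
} CommitRecord;

#define WAL_CAPACITY 8

/* Toy WAL: array order is the order a standby would replay commits in. */
typedef struct {
    CommitRecord records[WAL_CAPACITY];
    size_t       len;
} Wal;

/* Step 1: fix the commit timestamp (the role played by
 * SetCurrentTransactionStopTimestamp() in the message above). */
static CommitRecord take_commit_timestamp(int xid, int64_t now)
{
    CommitRecord rec = { xid, now };
    return rec;
}

/* Step 2: append the commit record (the role played by
 * XactLogCommitRecord()). The position decides visibility order. */
static void wal_insert(Wal *wal, CommitRecord rec)
{
    assert(wal->len < WAL_CAPACITY);
    wal->records[wal->len++] = rec;
}

/* Returns 1 if some record is replayed after a record that carries a
 * higher commit timestamp, 0 if timestamps never decrease. */
static int wal_has_timestamp_inversion(const Wal *wal)
{
    for (size_t i = 1; i < wal->len; i++) {
        if (wal->records[i].commit_ts < wal->records[i - 1].commit_ts)
            return 1;
    }
    return 0;
}

/* One possible interleaving of two concurrent commits A and B. */
static Wal simulate_interleaving(void)
{
    Wal wal = { .len = 0 };

    CommitRecord a = take_commit_timestamp(1, 100); /* A reads the clock first */
    CommitRecord b = take_commit_timestamp(2, 101); /* B reads it slightly later */

    wal_insert(&wal, b);  /* ... but B reaches the WAL first */
    wal_insert(&wal, a);  /* A follows, carrying the LOWER timestamp */

    return wal;
}
```

In this interleaving a client reading from the standby right after B's
record is replayed sees T = 101 as the latest commit timestamp, and A's
record with timestamp 100 only shows up afterwards, which is exactly the
case asked about. Calling wal_has_timestamp_inversion() on the result of
simulate_interleaving() reports the inversion.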