Hello,
I'm running PostgreSQL 13.1 on Windows Server 2019 and occasionally
get the following log entries:
2021-02-11 12:34:10.149 NZDT [6072] LOG: could not rename file "pg_wal/0000000100000099000000D3": Permission denied
2021-02-11 12:40:31.377 NZDT [6072] LOG: could not rename file "pg_wal/0000000100000099000000D3": Permission denied
2021-02-11 12:46:06.294 NZDT [6072] LOG: could not rename file "pg_wal/0000000100000099000000D3": Permission denied
2021-02-11 12:46:16.502 NZDT [6072] LOG: could not rename file "pg_wal/0000000100000099000000DA": Permission denied
2021-02-11 12:50:20.917 NZDT [6072] LOG: could not rename file "pg_wal/0000000100000099000000D3": Permission denied
2021-02-11 12:50:31.098 NZDT [6072] LOG: could not rename file "pg_wal/0000000100000099000000DA": Permission denied
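In case it helps anyone compare notes, here is a minimal Python sketch for tallying how often each segment shows up in these messages. It is only an illustration: the function name count_rename_failures is my own invention, not part of PostgreSQL, and the pattern assumes the message text looks exactly like the entries above. Whitespace is collapsed before matching, so it works whether or not an entry is wrapped across lines.

```python
import re
from collections import Counter

# Assumed message shape, taken from the log entries in this post.
RENAME_FAILURE = re.compile(
    r'could not rename file "pg_wal/(?P<segment>[0-9A-F]{24})": Permission denied'
)


def count_rename_failures(log_text: str) -> Counter:
    """Count "could not rename" failures per WAL segment name."""
    # Collapse all whitespace runs so wrapped entries still match.
    flattened = " ".join(log_text.split())
    return Counter(m.group("segment") for m in RENAME_FAILURE.finditer(flattened))


sample = """\
2021-02-11 12:34:10.149 NZDT [6072] LOG: could not rename
file "pg_wal/0000000100000099000000D3": Permission denied
2021-02-11 12:46:16.502 NZDT [6072] LOG: could not rename
file "pg_wal/0000000100000099000000DA": Permission denied
2021-02-11 12:50:20.917 NZDT [6072] LOG: could not rename
file "pg_wal/0000000100000099000000D3": Permission denied
"""

# Print segments from most to least frequently reported.
for segment, failures in count_rename_failures(sample).most_common():
    print(segment, failures)
```

On the three sample entries this prints the D3 segment with a count of 2 and the DA segment with a count of 1.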
What appears to be happening is that the affected WAL files
(usually only 2 or 3 at a time) are somehow "losing" their NTFS
permissions, so the PG process can't rename them, even though the
PG process created them. Even running icacls as admin gives
"Access is denied" on those files. A further oddity is that the
affected files do end up disappearing after a while.
The NTFS permissions on the pg_wal directory are correct, and
most WAL files are unaffected. Chkdsk reports no problems, and the
database is working fine otherwise. I have tried disabling the
antivirus software in case it was interfering, but that made no
difference.
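To see which files are affected at a given moment without going through icacls one file at a time, something like the following Python sketch could be used. It is a rough illustration under my own assumptions: find_inaccessible_files is a made-up helper name, and it only reports that opening a file for reading failed with a permission error, not why. The demo runs against a throwaway temporary directory; in practice you would pass the path of your own pg_wal directory.

```python
import os
import tempfile
from typing import List


def find_inaccessible_files(directory: str) -> List[str]:
    """Return names of files in `directory` that cannot be opened for reading."""
    blocked = []
    for name in sorted(os.listdir(directory)):
        path = os.path.join(directory, name)
        if os.path.isdir(path):
            continue  # skip subdirectories such as archive_status
        try:
            # Open read-only and close again straight away; nothing is modified.
            with open(path, "rb"):
                pass
        except PermissionError:
            blocked.append(name)
    return blocked


# Demo on a temporary directory holding one ordinary, readable file.
with tempfile.TemporaryDirectory() as demo_dir:
    with open(os.path.join(demo_dir, "0000000100000099000000D3"), "wb") as f:
        f.write(b"\0")
    print(find_inaccessible_files(demo_dir))
```

The demo prints an empty list because the file it creates is readable. Running it periodically against the real directory would show whether the set of blocked files matches the names in the log.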
I found another recent report of similar behaviour here:
https://stackoverflow.com/questions/65405479/postgresql-13-log-could-not-rename-file-pg-wal-0000000100000001000000c6
The WAL config is as follows:
wal_level = replica
fsync = on
synchronous_commit = on
wal_sync_method = fsync
full_page_writes = on
wal_compression = off
wal_log_hints = off
wal_init_zero = on
wal_recycle = on
wal_buffers = -1
wal_writer_delay = 200ms
wal_writer_flush_after = 1MB
wal_skip_threshold = 2MB
commit_delay = 0
commit_siblings = 5
checkpoint_timeout = 5min
max_wal_size = 2GB
min_wal_size = 256MB
checkpoint_completion_target = 0.7
checkpoint_flush_after = 0
checkpoint_warning = 30s
archive_mode = off
I'm thinking of disabling wal_recycle as a first step to see if
that makes any difference, but thought I'd seek some comments
first.
I'm not sure how much of a problem this is - the database is
running fine otherwise - but any thoughts would be appreciated.
Thanks & regards,
Guy