Hi,

I'm facing the same situation again, but this time, analyzing the WAL with xxd shows a different pattern: there are no blocks of 0000.

The output of pg_waldump is:

    pg_waldump: fatal: error in WAL record at 11C/93F9FF70: invalid magic number 0000 in log segment 000000010000011C00000093, offset 16384000

The output of xxd -C16 is:

    00f9ff60: b364 0079 6e61 6d69 6320 6c80 0300 0000  .d.ynamic l.....
    00f9ff70: 4000 0000 6659 a406 60f7 f993 1c01 0000  @...fY..`.......
    00f9ff80: 000b 0000 82b3 8d9b 0020 1000 7f06 0000  ......... ......

I'm still unable to determine the cause of the issue. I can't tell whether the primary server is sending a corrupted WAL segment, the secondary is receiving a corrupted WAL segment, the openzfs filesystem on the primary is letting wal_sender read a WAL segment that has not been fully written yet, or ...

Is there any log option I can enable on the two clusters to help me locate the origin of the issue?

thanks,

Nicolas.

On Tuesday, April 16th, 2024 at 09:56, Nicolas Seinlet <nicolas@xxxxxxxxxxx> wrote:

> Hello,
>
> > What exactly is "cyphered ZFS"? Can you reproduce the problem with some
> > other filesystem? If it's something very unusual, it might well be a
> > bug in the filesystem.
>
> The filesystem is openzfs with native aes-256-gcm encryption:
> https://openzfs.github.io/openzfs-docs/man/master/7/zfsprops.7.html#encryption
>
> I've not tested if we get the same issue on another filesystem.
>
> I don't face the issue on Ubuntu 20.04/openzfs 0.8/PostgreSQL 12, but I have fewer systems with this deployment.
> On Ubuntu 22.04/openzfs 2.1.5/PostgreSQL 14, I face the issue from time to time, without knowing what triggers the error.
>
> thanks for helping,
>
> Nicolas.
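
For anyone relating the numbers in the pg_waldump message to the xxd offsets, here is a minimal sketch of the arithmetic in Python. It assumes the default 16 MB WAL segment size and the default 8 kB WAL block size, which may not match a custom build; the function name locate_lsn is hypothetical and not part of any PostgreSQL tool.

```python
# Assumed defaults; check wal_segment_size and wal_block_size on the cluster.
WAL_SEGMENT_SIZE = 16 * 1024 * 1024
WAL_BLOCK_SIZE = 8192


def locate_lsn(lsn, timeline=1):
    """Map an LSN such as '11C/93F9FF70' to its segment file name, the byte
    offset of the LSN inside that segment, and the offset of the next page."""
    high, low = lsn.split("/")
    position = (int(high, 16) << 32) | int(low, 16)

    # Segment files are named timeline + (segment number split in two parts).
    segments_per_xlogid = 0x100000000 // WAL_SEGMENT_SIZE
    segment_number = position // WAL_SEGMENT_SIZE
    segment_name = "%08X%08X%08X" % (
        timeline,
        segment_number // segments_per_xlogid,
        segment_number % segments_per_xlogid,
    )

    offset = position % WAL_SEGMENT_SIZE
    next_page_offset = (offset // WAL_BLOCK_SIZE + 1) * WAL_BLOCK_SIZE
    return segment_name, offset, next_page_offset


name, offset, next_page = locate_lsn("11C/93F9FF70")
print(name, hex(offset), next_page)
# 000000010000011C00000093 0xf9ff70 16384000
```

Under those assumptions, the LSN maps to offset 0xf9ff70 in the segment (the second xxd line above), and the offset 16384000 reported by pg_waldump is the start of the following 8 kB page, which is where the invalid page magic number was read.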