> On 16. Sep 2024, at 09:14, Christian Theune <ct@xxxxxxxxxxxxxxx> wrote:
>
>> On 16. Sep 2024, at 02:00, Dave Chinner <david@xxxxxxxxxxxxx> wrote:
>>
>> I don't think this is a data corruption/loss problem - it certainly
>> hasn't ever appeared that way to me. The "data loss" appeared to be
>> incomplete postgres dump files after the system was rebooted, and
>> this is exactly what would happen when you randomly crash the
>> system: dirty data in memory is lost, and application data being
>> written at the time is in an inconsistent state after the system
>> recovers. IOWs, there was no clear evidence of actual data
>> corruption occurring, and data loss is definitely expected when the
>> page cache iteration hangs and the system is forcibly rebooted
>> without being able to sync or unmount the filesystems…
>>
>> All the hangs seem to be caused by folio lookup getting stuck
>> on a rogue xarray entry in truncate or readahead. If we find an
>> invalid entry or a folio from a different mapping or with an
>> unexpected index, we skip it and try again. Hence this does not
>> appear to be a data corruption vector, either - it results in a
>> livelock from endless retry because of the bad entry in the xarray.
>> This endless retry livelock appears to be what is being reported.
>>
>> IOWs, there is no evidence of real runtime data corruption or loss
>> from this pagecache livelock bug. We also haven't heard of any
>> random file data corruption events since we've enabled large folios
>> on XFS. Hence there really is no evidence to indicate that there is
>> a large folio xarray lookup bug that results in data corruption in
>> the existing code, and therefore there is no obvious reason for
>> turning off the functionality we are already building significant
>> new functionality on top of.

I've been chewing more on this and reviewed the tickets I have.
We did see a PostgreSQL database end up reporting "ERROR: invalid page in
block 30896 of relation base/16389/103292".

My understanding of the argument that this bug does not corrupt data is that
the error would only lead to a crash-consistent state. Applications that can
properly recover from a crash-consistent state would then only experience
data loss up to the point of the crash (which is fine and expected), but
should not end up in a further corrupted state.

PostgreSQL reporting this error indicates - to my knowledge - that it did
not see a crash-consistent state of the file system.

Christian

--
Christian Theune · ct@xxxxxxxxxxxxxxx · +49 345 219401 0
Flying Circus Internet Operations GmbH · https://flyingcircus.io
Leipziger Str. 70/71 · 06108 Halle (Saale) · Deutschland
HR Stendal HRB 21169 · Geschäftsführer: Christian Theune, Christian Zagrodnick