On Tue, Jul 27, 2021 at 07:52:21AM +1000, Dave Chinner wrote: > On Mon, Jul 26, 2021 at 10:57:01AM -0700, Darrick J. Wong wrote: > > On Mon, Jul 26, 2021 at 04:07:15PM +1000, Dave Chinner wrote: > > > IOWs, attr3 leaf buffers fall through the magic number checks > > > unrecognised, so trigger the "recover immediately" behaviour instead > > > of undergoing an LSN check. IOWs, we incorrectly replay ATTR3 leaf > > > buffers and that causes silent on disk corruption of inode attribute > > > forks and potentially other things.... > > > > > > Git history shows this is *another* zero day bug, this time > > > introduced in commit 50d5c8d8e938 ("xfs: check LSN ordering for v5 > > > superblocks during recovery") which failed to handle the attr3 leaf > > > buffers in recovery. And we've failed to handle them ever since... > > > > I wonder, what happens if we happen to have a rt bitmap block where a > > sparse allocation pattern at the start of the rt device just happens to > > match one of these magic numbers + fs UUID? Does that imply that log > > recovery can be tricked into forgetting to replay rtbitmap blocks? > > Possibly. RT bitmap/summary buffers are marked by type in the > xfs_buf_log_format type field so log recovery can recognise these > and do the right thing with them. So it really comes down to whether > log recovery handles XFS_BLFT_RTBITMAP_BUF types differently to any > other buffers. Which, without looking at the code, I doubt it does, > so there's probably fixes needed there, too... It handles them the same as every other buffer, which is to say that I think we've found another recovery zeroday. xlog_recover_buf_commit_pass2 reads the ondisk buffer, and then calls xlog_recover_get_buf_lsn to fish the LSN out of the ondisk buffer. That second function doesn't corroborate the ondisk magic with the XFS_BLFT_* flags recovered from the buffer item, so if the log item was for an rt bitmap block and the user controls the rt layout as I describe above, they can totally screw up log recovery. Only after we return a garbage LSN do we call xlog_recover_do_reg_buffer -> xlog_recover_validate_buf_type and look at the buf_f flags to attach verifier ops, but by then it's too late to undo the damage. I think the answer is to combine the two functions so that we check the BLFT and the ondisk magic. If they match, we can set b_ops and return the ondisk LSN and then decide if we're really going to replay the bli contents. If they don't match, I guess we recover the whole bli? Or abort? I'll try to get to that after $meetings. --D > > Cheers, > > Dave. > -- > Dave Chinner > david@xxxxxxxxxxxxx