On Wed, Nov 29, 2023 at 07:18:05AM +0100, Christoph Hellwig wrote: > On Wed, Nov 29, 2023 at 08:08:37AM +1100, Dave Chinner wrote: > > > But places that use buf_get and manually fill in data only use it in a > > > few cases. > > > > Yes. the caller of buf_get always needs to set XBF_DONE if it is > > initialising a new buffer ready for it to be written. It should be > > done before the caller drops the buffer lock so that no other lookup > > can see the buffer in the state of "contains valid data but does not > > have XBF_DONE set". > > That makes sense, but we do have a whole bunch of weird things going > on as well: > > - xfs_buf_ioend_handle_error sets XBF_DONE when retrying or failing Write path. It is expected that XBF_DONE is set prior to submitting the write, so this should be a no-op. It's probably detritus from the repeated factoring of the buf_ioend code over the years that we've never looked deeply into. > - xfs_buf_ioend sets XBF_DONE on successful write completion as well Same. It's really only needed on read IO, but I'm guessing that somewhere along the line it was done for all IO and we've never stopped doing that. > - xfs_buf_ioend_fail drops XBF_DONE for any I/O failure That's actually correct. We're about to toss the buffer because we can't write it back. The very next function call is xfs_buf_stale(), which invalidates the buffer and it's contents. It should not have XBF_DONE and XBF_STALE set at the same time. It may be worthwhile to move the clearing of XBF_DONE to be inside xfs_buf_stale(), but I'd have to look at the rest of the code to see if that's the right thing to do or not. > - xfs_do_force_shutdown sets XBF_DONE on the super block buffer on > a foced shutdown No idea why that exists. I'd have to go code spelunking to find out what it was needed for. > - xfs_trans_get_buf_map sets XBF_DONE on a forced shutdown That code just looks broken. we're in the middle of a transaction doing a buffer read, we found a match and raced with a shutdown so we stale the buffer, mark it donei, bump the recursion count and return it without an error? A quick spelunk through history indicates stale vs undelay-write vs XFS_BUF_DONE was an inconsistent mess. We fixed all the stale vs done nastiness in the xfs_trans_buf_read() path, but missed this case in the get path. As it is, I'd say the way shutdown racing is handled here is broken and could be fixed by returning -EIO instead of the staled buffer. The buffer is already linked into the transaction, so the transaction cancel in response to the IO error will handle the buffer appropriately.... > So there's definitively a bunch of weird things not fully in line > with the straight forward answer. No surprises there - this is a 30 year old code base. Nothing looks particularly problematic, though. Cheers, Dave. -- Dave Chinner david@xxxxxxxxxxxxx