On Fri, Nov 24, 2023 at 10:05:38PM -0800, Christoph Hellwig wrote:
> On Fri, Nov 24, 2023 at 03:50:17PM -0800, Darrick J. Wong wrote:
> > Going forward, repair functions should commit the transaction if they're
> > going to return success.  Usually the space reaping functions that run
> > after a successful atomic commit of the new metadata will take care of
> > that for us.
>
> Generally looks good:
>
> Reviewed-by: Christoph Hellwig <hch@xxxxxx>
>
> A random comment on a pre-existing function from reading the code, and
> a nitpick on the patch itself below:
>
> > +++ b/fs/xfs/scrub/agheader_repair.c
> > @@ -73,7 +73,7 @@ xrep_superblock(
> >  	/* Write this to disk. */
> >  	xfs_trans_buf_set_type(sc->tp, bp, XFS_BLFT_SB_BUF);
> >  	xfs_trans_log_buf(sc->tp, bp, 0, BBTOB(bp->b_length) - 1);
> > -	return error;
> > +	return 0;
>
> After looking through the code this is obviously fine, error must
> be 0 here because the last patch touching it is xchk_should_terminate,
> which only sets the error if it returns true.

<nod>

> But the calling conventions for xchk_should_terminate really make me
> scratch my head as they are so hard to reason about.  I did a quick
> look over most callers and most of them get there with error always
> set to 0.  So just making xchk_should_terminate return the error
> would seem a lot better to me - any caller with a previous error
> would need a second error2, but that seems better than what we have
> there right now.

Agreed, the callsites would be a bit more obvious if they looked like:

	error = xchk_should_terminate(sc);
	if (error)
		break;

Though I'm working on some tweaks of that function, since it was pointed
out to me that cond_resched() and fatal_signal_pending() aren't entirely
free.  What I've been testing out the last three weeks is:

	unsigned long now = jiffies;

	if (time_after(sc->next_poke, now)) {
		sc->next_poke = now + (HZ / 10);

		cond_resched();
		if (fatal_signal_pending(current))
			return -EINTR;
	}

	return 0;

So far I haven't seen much improvement, but the callsite change is
something that I think I could promote to the end of online repair part 2.

> >  /* Repair the AGF. v5 filesystems only. */
> > @@ -789,6 +789,9 @@ xrep_agfl(
> >  	/* Dump any AGFL overflow. */
> >  	error = xrep_reap_agblocks(sc, &agfl_extents, &XFS_RMAP_OINFO_AG,
> >  			XFS_AG_RESV_AGFL);
> > +	if (error)
> > +		goto err;
> > +
> >  err:
>
> This seems rather pointless and doesn't change anything..

Oops, lemme get rid of that dead code...

--D
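
For illustration only, here is a minimal userspace sketch of the two ideas discussed above: a termination check that returns the error directly, and rate-limiting the costly part of that check to roughly ten times per second. Everything prefixed `sketch_` or `SKETCH_` is hypothetical and is not kernel code; the clock and the pending-signal flag are passed in as plain arguments in place of `jiffies` and `fatal_signal_pending(current)`. The sketch also assumes the intent is to poll once the deadline has passed, so it tests `sketch_time_after(now, sc->next_poke)`; that argument order is an assumption and differs from the snippet as posted.

```c
#include <errno.h>

/* Hypothetical stand-in for the kernel's HZ (ticks per second). */
#define SKETCH_HZ 1000UL

/* Wraparound-safe "a is later than b", modeled on the kernel's time_after(). */
static int sketch_time_after(unsigned long a, unsigned long b)
{
	return (long)(b - a) < 0;
}

/* Hypothetical per-scrub state; only what this sketch needs. */
struct sketch_scrub {
	unsigned long next_poke;	/* tick after which we poll again */
	unsigned int polls;		/* how often the costly check ran */
};

/*
 * Return 0 to keep going or -EINTR to stop.  The costly check (here just
 * reading signal_pending) runs at most once per SKETCH_HZ / 10 ticks.
 */
static int sketch_should_terminate(struct sketch_scrub *sc, unsigned long now,
				   int signal_pending)
{
	if (sketch_time_after(now, sc->next_poke)) {
		sc->next_poke = now + (SKETCH_HZ / 10);
		sc->polls++;
		if (signal_pending)
			return -EINTR;
	}
	return 0;
}

/* Callsite in the proposed style: the helper returns the error itself. */
static int sketch_walk(struct sketch_scrub *sc, unsigned long start,
		       unsigned long nr_ticks, unsigned long signal_at)
{
	unsigned long now;
	int error = 0;

	for (now = start; now < start + nr_ticks; now++) {
		error = sketch_should_terminate(sc, now, now >= signal_at);
		if (error)
			break;
	}
	return error;
}
```

One consequence of the rate limit in this sketch is that a pending signal may go unnoticed until the next poll, which is the tradeoff for skipping the check on most iterations.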