Re: [PATCH 2/5] xfs: roll the scrub transaction after completing a repair

On Fri, Nov 24, 2023 at 10:05:38PM -0800, Christoph Hellwig wrote:
> On Fri, Nov 24, 2023 at 03:50:17PM -0800, Darrick J. Wong wrote:
> > Going forward, repair functions should commit the transaction if they're
> > going to return success.  Usually the space reaping functions that run
> > after a successful atomic commit of the new metadata will take care of
> > that for us.
> 
> Generally looks good:
> 
> Reviewed-by: Christoph Hellwig <hch@xxxxxx>
> 
> A random comment on a pre-existing function from reading the code, and
> a nitpick on the patch itself below:
> 
> > +++ b/fs/xfs/scrub/agheader_repair.c
> > @@ -73,7 +73,7 @@ xrep_superblock(
> >  	/* Write this to disk. */
> >  	xfs_trans_buf_set_type(sc->tp, bp, XFS_BLFT_SB_BUF);
> >  	xfs_trans_log_buf(sc->tp, bp, 0, BBTOB(bp->b_length) - 1);
> > -	return error;
> > +	return 0;
> 
> After looking through the code this is obviously fine, error must
> be 0 here because the last thing touching it is xchk_should_terminate,
> which only sets the error if it returns true.

<nod>

> But the calling conventions for xchk_should_terminate really make me
> scratch my head as they are so hard to reason about.  I did a quick
> look over most callers, and most of them get there with error always
> set to 0.  So just making xchk_should_terminate return the error
> would seem a lot better to me - any caller with a previous error
> would need a second error2, but that seems better than what we have
> there right now.

Agreed, the callsites would be a bit more obvious if they looked like:

	error = xchk_should_terminate(sc);
	if (error)
		break;
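
For comparison, here is a minimal user-space sketch of the two calling conventions under discussion. Everything in it is a hypothetical stand-in, not the kernel code: struct mock_scrub, its kill_pending flag (which plays the role of a pending fatal signal), and the should_terminate_old/should_terminate_new/walk_new names are assumptions made only for illustration. It also shows the "error2" case for a caller that arrives with an earlier error.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for the scrub context; not the kernel struct. */
struct mock_scrub {
	bool	kill_pending;	/* pretend a fatal signal is pending */
};

/*
 * Bool-returning style: the return value says "stop", and *error is
 * only written when we stop and no earlier error is recorded.
 */
static bool should_terminate_old(struct mock_scrub *sc, int *error)
{
	if (sc->kill_pending) {
		if (*error == 0)
			*error = -EINTR;
		return true;
	}
	return false;
}

/* Error-returning style: zero means keep going, negative errno means stop. */
static int should_terminate_new(struct mock_scrub *sc)
{
	return sc->kill_pending ? -EINTR : 0;
}

/*
 * A caller that already holds an error needs a second variable so the
 * earlier error is not overwritten by the termination check.
 */
static int walk_new(struct mock_scrub *sc, int prior_error)
{
	int	error = prior_error;
	int	error2;

	error2 = should_terminate_new(sc);
	if (error2 && !error)
		error = error2;
	return error;
}
```

In the sketch, callers that start with error == 0 can assign the return value directly, and only walk_new() pays for the extra variable.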

Though I'm working on some tweaks of that function, since it was pointed
out to me that cond_resched() and fatal_signal_pending() aren't entirely
free.  What I've been testing out the last three weeks is:

	unsigned long now = jiffies;

	if (time_after(now, sc->next_poke)) {
		sc->next_poke = now + (HZ / 10);

		cond_resched();

		if (fatal_signal_pending(current))
			return -EINTR;
	}
	return 0;

So far I haven't seen much improvement, but the callsite change is
something that I think I could promote to the end of online repair
part 2.
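
To make the throttling idea concrete, here is a small user-space sketch with a fake tick counter in place of jiffies. All names are hypothetical (struct poke_scrub, MOCK_HZ, tick_after, should_terminate_throttled), and the kill_pending flag stands in for fatal_signal_pending(). The sketch assumes the expensive check should run only once the current tick has moved past next_poke; tick_after() uses the same wraparound-safe signed-difference comparison as the kernel's time_after().

```c
#include <errno.h>
#include <stdbool.h>

#define MOCK_HZ	1000UL		/* hypothetical tick rate, ticks per second */

/* True if tick a is after tick b, tolerating counter wraparound. */
static bool tick_after(unsigned long a, unsigned long b)
{
	return (long)(b - a) < 0;
}

/* Hypothetical stand-in for the scrub context; not the kernel struct. */
struct poke_scrub {
	unsigned long	next_poke;	/* tick after which we check again */
	bool		kill_pending;	/* pretend a fatal signal is pending */
	unsigned int	nr_checks;	/* times the expensive path ran */
};

/*
 * Run the expensive check at most once per MOCK_HZ / 10 ticks.
 * Returns 0 to keep going or -EINTR to stop.
 */
static int should_terminate_throttled(struct poke_scrub *sc, unsigned long now)
{
	if (tick_after(now, sc->next_poke)) {
		sc->next_poke = now + (MOCK_HZ / 10);
		sc->nr_checks++;

		/* A kernel version would call cond_resched() here. */
		if (sc->kill_pending)
			return -EINTR;
	}
	return 0;
}
```

The trade-off visible in the sketch is latency: a signal that arrives just after a check is not noticed until the next window opens, up to MOCK_HZ / 10 ticks later.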

> >  /* Repair the AGF. v5 filesystems only. */
> > @@ -789,6 +789,9 @@ xrep_agfl(
> >  	/* Dump any AGFL overflow. */
> >  	error = xrep_reap_agblocks(sc, &agfl_extents, &XFS_RMAP_OINFO_AG,
> >  			XFS_AG_RESV_AGFL);
> > +	if (error)
> > +		goto err;
> > +
> >  err:
> 
> This seems rather pointless and doesn't change anything.

Oops, lemme get rid of that dead code...

--D
