Re: [PATCH 3/3] xfs: terminate perag iteration reliably on end agno

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Fri, Oct 08, 2021 at 10:45:13AM -0400, Brian Foster wrote:
> On Fri, Oct 08, 2021 at 10:02:59AM +1100, Dave Chinner wrote:
> > On Thu, Oct 07, 2021 at 08:50:53AM -0400, Brian Foster wrote:
> > > diff --git a/fs/xfs/libxfs/xfs_ag.h b/fs/xfs/libxfs/xfs_ag.h
> > > index d05c9217c3af..edcdd4fbc225 100644
> > > --- a/fs/xfs/libxfs/xfs_ag.h
> > > +++ b/fs/xfs/libxfs/xfs_ag.h
> > > @@ -116,34 +116,30 @@ void xfs_perag_put(struct xfs_perag *pag);
> > >  
> > >  /*
> > >   * Perag iteration APIs
> > > - *
> > > - * XXX: for_each_perag_range() usage really needs an iterator to clean up when
> > > - * we terminate at end_agno because we may have taken a reference to the perag
> > > - * beyond end_agno. Right now callers have to be careful to catch and clean that
> > > - * up themselves. This is not necessary for the callers of for_each_perag() and
> > > - * for_each_perag_from() because they terminate at sb_agcount where there are
> > > - * no perag structures in tree beyond end_agno.
> > 
> > We still really need an iterator for the range iterations so that we
> > can have a consistent set of behaviours for all iterations and
> > don't need a special case just for the "mid walk break" where the
> > code keeps the active reference to the perag for itself...
> > 
> 
> Ok, but what exactly are you referring to by "an iterator" beyond what
> we have here to this point? A walker function with a callback or
> something? And why wouldn't we have done that in the first place instead
> of introducing the API wart documented above?

We didn't do it in the first place because we started with the tag
walks that only ever terminate on NULL - that was Darrick's code
used in the inode cache walk functions.. Converting the rest of
agcount based walks was not straight-forward - there were different
cases that needed to be handled and this was the least worst way to
begin the conversion needed for shrink.

Looking back at the history, it was that last conversion for
GETFSMAP that caused the problems sb_agcount based walks in commit
58d43a7e3263 ("xfs: pass perags around in fsmap data dev functions")
where for_each_perag_from() was converted to
for_each_perag_from_range(). There was no termination leak before
this, and races with growfs were simply handled with termination by
NULL lookup return. The bug iwas introduced because we moved away
from termination on NULL lookup by adding a termination via loop
index to support GETFSMAP.

We knew about the API wart and put a plan in place to address it in
future by movign to iterators. That looks something like this but
without the C99 automatic cleanup so external iterator setup and
teardown:

https://lore.kernel.org/linux-xfs/162814685996.2777088.11268635137040103857.stgit@magnolia/

So, yes, we need to fix the bug that was introduced, but asking why
the code wasn't perfect before it was merged doesn't help us make
small steps forwards towards solving the bigger problems we need to
address...

> > >  }
> > >  
> > >  #define for_each_perag_range(mp, agno, end_agno, pag) \
> > >  	for ((pag) = xfs_perag_get((mp), (agno)); \
> > > -		(pag) != NULL && (agno) <= (end_agno); \
> > > -		(pag) = xfs_perag_next((pag), &(agno)))
> > > +		(pag) != NULL; \
> > > +		(pag) = xfs_perag_next((pag), &(agno), (end_agno)))
> > >  
> > >  #define for_each_perag_from(mp, agno, pag) \
> > > -	for_each_perag_range((mp), (agno), (mp)->m_sb.sb_agcount, (pag))
> > > +	for_each_perag_range((mp), (agno), (mp)->m_sb.sb_agcount - 1, (pag))
> > 
> > Isn't this one line the entire bug fix right here? i.e. the
> > factoring is largely unnecessary, the grow race bug is fixed by just
> > this one-liner?
> > 
> 
> No, the reference count problems can still occur regardless of this
> particular change.

Ok, so there are two problems then - one is an off-by one that can
result in perag lookups beyond the EOFS racing with growfs instead
of returning NULL.

The other is loop termination via loop index limits can leak a perag
when terminating the walk on the end index rather than a NULL. I
think these two bug fixes should at least be separate patches. Fix
the off-by-one in one patch, fix the termination issue in another.

Cheers,

Dave.
-- 
Dave Chinner
david@xxxxxxxxxxxxx



[Index of Archives]     [XFS Filesystem Development (older mail)]     [Linux Filesystem Development]     [Linux Audio Users]     [Yosemite Trails]     [Linux Kernel]     [Linux RAID]     [Linux SCSI]


  Powered by Linux