On Mon, Aug 12, 2024 at 08:31:01AM +0200, Christoph Hellwig wrote:
> Convert the perag lookup from the legacy radix tree to the xarray,
> which allows for much nicer iteration and bulk lookup semantics.
>
> Note that this removes the helpers for tagged get and grab and the
> for_each* wrappers built around them and instead uses the xa_for_each*
> iteration helpers directly in xfs_icache.c, which simplifies the code
> nicely.

Can we split the implementation change and the API change into two
separate patches, please? I have no problems with the xarray
conversion; I have reservations about the API changes.

I have no idea what the new iteration method that is needed looks
like, but I'd prefer not to be exposing all the perag locking and
reference counting semantics all over the code - the current
iterators were introduced to remove all that stuff from existing
iterators. This patch makes iteration go back to this model:

	rcu_read_lock()
	xa_for_each....() {
		/* get active or passive ref count */
		....
		rcu_read_unlock();

		/* do work on AG */

		/* put/rele perag */

		/* take RCU lock for next perag lookup */
		rcu_read_lock();
	}
	rcu_read_unlock();

And that feels like a step backward from an API perspective, not an
improvement....

So what's the overall plan for avoiding this sort of mess
everywhere? Can we re-implement the existing iterators more
efficiently with xarray iterators, or does xarray-based iteration
require going back to the old way of open coding all iterations?

> @@ -1493,21 +1497,32 @@ xfs_blockgc_flush_all(
>  	struct xfs_mount	*mp)
>  {
>  	struct xfs_perag	*pag;
> -	xfs_agnumber_t		agno;
> +	unsigned long		index = 0;
>  
>  	trace_xfs_blockgc_flush_all(mp, __return_address);
>  
>  	/*
> -	 * For each blockgc worker, move its queue time up to now. If it
> -	 * wasn't queued, it will not be requeued. Then flush whatever's
> -	 * left.
> +	 * For each blockgc worker, move its queue time up to now. If it wasn't
> +	 * queued, it will not be requeued. Then flush whatever is left.
>  	 */
> -	for_each_perag_tag(mp, agno, pag, XFS_ICI_BLOCKGC_TAG)
> -		mod_delayed_work(pag->pag_mount->m_blockgc_wq,
> -				&pag->pag_blockgc_work, 0);
> +	rcu_read_lock();
> +	xa_for_each_marked(&mp->m_perags, index, pag, XFS_ICI_BLOCKGC_TAG)
> +		mod_delayed_work(mp->m_blockgc_wq, &pag->pag_blockgc_work, 0);
> +	rcu_read_unlock();
>  
> +	index = 0;
> +	rcu_read_lock();
> +	xa_for_each_marked(&mp->m_perags, index, pag, XFS_ICI_BLOCKGC_TAG) {
> +		if (!atomic_inc_not_zero(&pag->pag_active_ref))
> +			continue;
> +		rcu_read_unlock();
>  
> -	for_each_perag_tag(mp, agno, pag, XFS_ICI_BLOCKGC_TAG)
>  		flush_delayed_work(&pag->pag_blockgc_work);
> +		xfs_perag_rele(pag);
> +
> +		rcu_read_lock();
> +	}
> +	rcu_read_unlock();

And this is the whole problem with open coding iterators.

The first iterator accesses perag structures and potentially queues
them for work without holding a valid reference to the perag. The
second iteration takes reference counts, so it can access the perag
safely.

Why are these two iterations different? What makes the first,
non-reference-counted iteration safe?

>  
>  	return xfs_inodegc_flush(mp);
>  }
> @@ -1755,18 +1770,26 @@ xfs_icwalk(
>  	struct xfs_perag	*pag;
>  	int			error = 0;
>  	int			last_error = 0;
> -	xfs_agnumber_t		agno;
> +	unsigned long		index = 0;
> +
> +	rcu_read_lock();
> +	xa_for_each_marked(&mp->m_perags, index, pag, goal) {
> +		if (!atomic_inc_not_zero(&pag->pag_active_ref))
> +			continue;
> +		rcu_read_unlock();
>  
> -	for_each_perag_tag(mp, agno, pag, goal) {
>  		error = xfs_icwalk_ag(pag, goal, icw);
> +		xfs_perag_rele(pag);
> +
> +		rcu_read_lock();
>  		if (error) {
>  			last_error = error;
> -			if (error == -EFSCORRUPTED) {
> -				xfs_perag_rele(pag);
> +			if (error == -EFSCORRUPTED)
>  				break;
> -			}
>  		}
>  	}
> +	rcu_read_unlock();
> +

And there's the open coded pattern I talked about earlier that we
introduced the for_each_perag iterators to avoid.

Like I said, converting to xarray - no problems with that.
Changing the iterator API - doesn't seem like a step forwards right
now.

-Dave.
-- 
Dave Chinner
david@xxxxxxxxxxxxx