Hi Dave, On Thu, Apr 15, 2021 at 04:28:24PM +1000, Dave Chinner wrote: > On Thu, Apr 15, 2021 at 12:28:37PM +0800, Gao Xiang wrote: > > Hi Dave, > > > > On Thu, Apr 15, 2021 at 01:42:55PM +1000, Dave Chinner wrote: > > > On Thu, Apr 15, 2021 at 03:52:37AM +0800, Gao Xiang wrote: > > > > To get rid of paralleled requests related to AGs which are pending > > > > for shrinking, mark these perags as inactive rather than playing > > > > with per-ag structures theirselves. > > > > > > > > Since in that way, a per-ag lock can be used to stablize the inactive > > > > status together with agi/agf buffer lock (which is much easier than > > > > adding more complicated perag_{get, put} pairs..) Also, Such per-ags > > > > can be released / reused when unmountfs / growfs. > > > > > > > > On the read side, pag_inactive_rwsem can be unlocked immediately after > > > > the agf or agi buffer lock is acquired. However, pag_inactive_rwsem > > > > can only be unlocked after the agf/agi buffer locks are all acquired > > > > with the inactive status on the write side. > > > > > > > > XXX: maybe there are some missing cases. > > > > > > > > Signed-off-by: Gao Xiang <hsiangkao@xxxxxxxxxx> > > > > --- > > > > fs/xfs/libxfs/xfs_ag.c | 16 +++++++++++++--- > > > > fs/xfs/libxfs/xfs_alloc.c | 12 +++++++++++- > > > > fs/xfs/libxfs/xfs_ialloc.c | 26 +++++++++++++++++++++++++- > > > > fs/xfs/xfs_mount.c | 2 ++ > > > > fs/xfs/xfs_mount.h | 6 ++++++ > > > > 5 files changed, 57 insertions(+), 5 deletions(-) > > > > > > > > diff --git a/fs/xfs/libxfs/xfs_ag.c b/fs/xfs/libxfs/xfs_ag.c > > > > index c68a36688474..ba5702e5c9ad 100644 > > > > --- a/fs/xfs/libxfs/xfs_ag.c > > > > +++ b/fs/xfs/libxfs/xfs_ag.c > > > > @@ -676,16 +676,24 @@ xfs_ag_get_geometry( > > > > if (agno >= mp->m_sb.sb_agcount) > > > > return -EINVAL; > > > > > > > > + pag = xfs_perag_get(mp, agno); > > > > + down_read(&pag->pag_inactive_rwsem); > > > > > > No need to encode the lock type in the lock name. We know it's a > > > rwsem from the lock API functions... > > > > > > > + > > > > + if (pag->pag_inactive) { > > > > + error = -EBUSY; > > > > + up_read(&pag->pag_inactive_rwsem); > > > > + goto out; > > > > + } > > > > > > This looks kinda heavyweight. Having to take a rwsem whenever we do > > > a perag lookup to determine if we can access the perag completely > > > defeats the purpose of xfs_perag_get() being a lightweight, lockless > > > operation. > > > > I'm not sure if it has some regression since write lock will be only > > taken when shrinking (shrinking is a rare operation), for most cases > > which is much similiar to perag radix root I think. > > It's still an extra pair of atomic operation per xfs_perag_get/put() > call pairs. pag_inactive being true is the slow/rare path, so we > should be trying to keep the overhead of detecting that path out of > all our fast paths... > > Indeed, xfs_perag_get() already shows up on profiles because of the > number of calls we make to it, so adding an extra atomic operation > for this operation is going to be noticable in terms of CPU usage, > if nothing else. > > > The locking logic is that, when pag->pag_inactive = false -> true, > > the write lock, AGF/AGI locks all have to be taken in advance. > > > > > > > > I suspect what we really want here is active/passive references like > > > are used for the superblock, and an API that hides the > > > implementation from all the callers. > > > > If my understanding is correct, my own observation these months is > > that the current XFS codebase is not well suitable to accept !pag > > (due to many logic assumes pag structure won't go away, since some > > are indexed/passed by agno rather than some pag reference count). > > Maybe so, but that's exactly what this patch is addressing - it's > adding a way to detect that perag has "gone away"i and should not be > referenced any more. > > It wasn't until I saw this patch that I realised that there is a way > that we can, in fact, safely handle perags being freed by ensuring > that the RCU lookup fails for these "active" references.... > > > Even I think we could introduce some active references, but handle > > the cover range is still a big project. The current approach assumes > > pag won't go away except for umounting and blocks allocation / imap/ > > ... paths to access that. > > The way I described should work just fine - nobody should be > accessing the per-ag without a reference gained through a lookup in > some way. Buffers carry a passive reference, because the per-ag > teardown will do the teardown of those references before the perag > is freed. Everything else carries an active reference and so > teardown cannot begin until all active references go away. > > > My current thought is that we could implement it in that way as the > > first step (in order to land the shrinking functionality to let > > end-users benefit of this), and by the codebase evolves, it can be > > transformed to a more gentle way. > > I think converting this patchset to active/passive references ias > I've described solves the problem entirely - there's no "evolving" > needed as we can solve it with this one structural change... Since currently even xfs_perag_put() reaches zero, it won't free the per-ag anyway (it may just use to mark the pointer is no longer used in the context? not sure what's the exact use of the such pairs), so in practice I think after active/passive references are introduced, there is still the only one real reference count that works for the per-ag lifetime management and currently it doesn't manage whole lifetime at all... So (my own understanding is) I think in practice, that approachs would be somewhat equal to relocate/rearrange xfs_perag_get()/put() pair to manage the perag lifetime instead. and maybe use xfs_perag_ptr() to access perag when some reference count is available. Thanks, Gao Xiang > > Cheers, > > Dave. > -- > Dave Chinner > david@xxxxxxxxxxxxx >