On Tue, Sep 28, 2021 at 08:01:36AM +1000, Dave Chinner wrote:
> On Mon, Sep 27, 2021 at 11:21:22AM -0700, Darrick J. Wong wrote:
> > On Sun, Sep 26, 2021 at 10:47:21AM +1000, Dave Chinner wrote:
> > > On Thu, Sep 23, 2021 at 06:27:59PM -0700, Darrick J. Wong wrote:
> > > > From: Darrick J. Wong <djwong@xxxxxxxxxx>
> > > > 
> > > > Now that we have the infrastructure to track the max possible height of
> > > > each btree type, we can create a separate slab zone for cursors of each
> > > > type of btree.  For smaller indices like the free space btrees, this
> > > > means that we can pack more cursors into a slab page, improving slab
> > > > utilization.
> > > > 
> > > > Signed-off-by: Darrick J. Wong <djwong@xxxxxxxxxx>
> > > > ---
> > > >  fs/xfs/libxfs/xfs_btree.c |   12 ++++++------
> > > >  fs/xfs/libxfs/xfs_btree.h |    9 +--------
> > > >  fs/xfs/xfs_super.c        |   33 ++++++++++++++++++++++++---------
> > > >  3 files changed, 31 insertions(+), 23 deletions(-)
> > > > 
> > > > 
> > > > diff --git a/fs/xfs/libxfs/xfs_btree.c b/fs/xfs/libxfs/xfs_btree.c
> > > > index 120280c998f8..3131de9ae631 100644
> > > > --- a/fs/xfs/libxfs/xfs_btree.c
> > > > +++ b/fs/xfs/libxfs/xfs_btree.c
> > > > @@ -26,7 +26,6 @@
> > > >  /*
> > > >   * Cursor allocation zone.
> > > >   */
> > > > -kmem_zone_t	*xfs_btree_cur_zone;
> > > >  struct xfs_btree_cur_zone xfs_btree_cur_zones[XFS_BTNUM_MAX] = {
> > > >  	[XFS_BTNUM_BNO]	= { .name = "xfs_alloc_btree_cur" },
> > > >  	[XFS_BTNUM_INO]	= { .name = "xfs_ialloc_btree_cur" },
> > > > @@ -364,6 +363,7 @@ xfs_btree_del_cursor(
> > > >  	struct xfs_btree_cur	*cur,		/* btree cursor */
> > > >  	int			error)		/* del because of error */
> > > >  {
> > > > +	struct xfs_btree_cur_zone *bczone = &xfs_btree_cur_zones[cur->bc_btnum];
> > > >  	int			i;		/* btree level */
> > > >  
> > > >  	/*
> > > > @@ -386,10 +386,10 @@ xfs_btree_del_cursor(
> > > >  	kmem_free(cur->bc_ops);
> > > >  	if (!(cur->bc_flags & XFS_BTREE_LONG_PTRS) && cur->bc_ag.pag)
> > > >  		xfs_perag_put(cur->bc_ag.pag);
> > > > -	if (cur->bc_maxlevels > XFS_BTREE_CUR_ZONE_MAXLEVELS)
> > > > +	if (cur->bc_maxlevels > bczone->maxlevels)
> > > >  		kmem_free(cur);
> > > >  	else
> > > > -		kmem_cache_free(xfs_btree_cur_zone, cur);
> > > > +		kmem_cache_free(bczone->zone, cur);
> > > >  }
> > > >  
> > > >  /*
> > > > @@ -5021,12 +5021,12 @@ xfs_btree_alloc_cursor(
> > > >  {
> > > >  	struct xfs_btree_cur	*cur;
> > > >  	unsigned int		maxlevels = xfs_btree_maxlevels(mp, btnum);
> > > > +	struct xfs_btree_cur_zone *bczone = &xfs_btree_cur_zones[btnum];
> > > >  
> > > > -	if (maxlevels > XFS_BTREE_CUR_ZONE_MAXLEVELS)
> > > > +	if (maxlevels > bczone->maxlevels)
> > > >  		cur = kmem_zalloc(xfs_btree_cur_sizeof(maxlevels), KM_NOFS);
> > > >  	else
> > > > -		cur = kmem_cache_zalloc(xfs_btree_cur_zone,
> > > > -				GFP_NOFS | __GFP_NOFAIL);
> > > > +		cur = kmem_cache_zalloc(bczone->zone, GFP_NOFS | __GFP_NOFAIL);
> > > 
> > > When will maxlevels ever be greater than bczone->maxlevels?  Isn't
> > > the bczone->maxlevels case always supposed to be the tallest
> > > possible height for that btree?
> > 
> > It should never happen, provided that the maxlevels computation and
> > verification are all correct.  I thought it was important to leave the
> > heap allocation in here as a fallback, since the consequence of getting
> > the size calculations wrong is corrupt kernel memory.
> 
> I think that this is the wrong approach.  Static debug-only testing
> of btree size calculations at init time is needed here, not runtime
> fallbacks that hide the fact that we got fundamental calculations
> wrong.  A mistake here should be loud and obvious, not hidden away in
> a fallback path that might never, ever be hit in the real world.

Agree.  I'll add an assert to the functions that compute the per-mount
maxlevels.

--D

> Cheers,
> 
> Dave.
> -- 
> Dave Chinner
> david@xxxxxxxxxxxxx