On Thu, Feb 1, 2024 at 11:30 AM Pasha Tatashin <pasha.tatashin@xxxxxxxxxx> wrote:
>
> From: Pasha Tatashin <pasha.tatashin@xxxxxxxxxx>
>
> The magazine buffers can take gigabytes of kmem memory, dominating all
> other allocations. For observability purposes, create a named slab cache so
> the iova magazine memory overhead can be clearly observed.
>
> With this change:
>
> > slabtop -o | head
> Active / Total Objects (% used)    : 869731 / 952904 (91.3%)
> Active / Total Slabs (% used)      : 103411 / 103974 (99.5%)
> Active / Total Caches (% used)     : 135 / 211 (64.0%)
> Active / Total Size (% used)       : 395389.68K / 411430.20K (96.1%)
> Minimum / Average / Maximum Object : 0.02K / 0.43K / 8.00K
>
>   OBJS ACTIVE  USE OBJ SIZE  SLABS OBJ/SLAB CACHE SIZE NAME
> 244412 244239  99%    1.00K  61103        4    244412K iommu_iova_magazine
>  91636  88343  96%    0.03K    739      124      2956K kmalloc-32
>  75744  74844  98%    0.12K   2367       32      9468K kernfs_node_cache
>
> On this machine it is now clear that magazines use 242M of kmem memory.
>
> Signed-off-by: Pasha Tatashin <pasha.tatashin@xxxxxxxxxx>
> ---
>  drivers/iommu/iova.c | 57 +++++++++++++++++++++++++++++++++++++++++---
>  1 file changed, 54 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/iommu/iova.c b/drivers/iommu/iova.c
> index d30e453d0fb4..617bbc2b79f5 100644
> --- a/drivers/iommu/iova.c
> +++ b/drivers/iommu/iova.c
> @@ -630,6 +630,10 @@ EXPORT_SYMBOL_GPL(reserve_iova);
>
>  #define IOVA_DEPOT_DELAY msecs_to_jiffies(100)
>
> +static struct kmem_cache *iova_magazine_cache;
> +static unsigned int iova_magazine_cache_users;
> +static DEFINE_MUTEX(iova_magazine_cache_mutex);
> +
>  struct iova_magazine {
>  	union {
>  		unsigned long size;
> @@ -654,11 +658,51 @@ struct iova_rcache {
>  	struct delayed_work work;
>  };
>
> +static int iova_magazine_cache_init(void)
> +{
> +	int ret = 0;
> +
> +	mutex_lock(&iova_magazine_cache_mutex);
> +
> +	iova_magazine_cache_users++;
> +	if (iova_magazine_cache_users > 1)
> +		goto out_unlock;
> +
> +	iova_magazine_cache = kmem_cache_create("iommu_iova_magazine",
> +						sizeof(struct iova_magazine),
> +						0, SLAB_HWCACHE_ALIGN, NULL);

Could this slab cache be merged with a compatible one by the slab code? If that happens, do we still get a separate entry in /proc/slabinfo? It may be useful to use SLAB_NO_MERGE if the purpose is specifically to have observability into this slab cache, but the comments above the flag make me think I may be misunderstanding it.
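To make the suggestion concrete: a sketch (untested, against recent kernels that provide the SLAB_NO_MERGE flag) of how the creation call in the patch could opt out of slab merging, so the cache keeps its own slabinfo line:

```c
/*
 * Sketch only, not a tested change. SLAB_NO_MERGE opts this cache
 * out of slab merging, so "iommu_iova_magazine" keeps its own entry
 * in /proc/slabinfo even if another cache has a compatible object
 * size and flags. Without it, the allocator may alias the cache into
 * a merged one and fold its statistics into that entry (unless the
 * system boots with slab_nomerge).
 */
iova_magazine_cache = kmem_cache_create("iommu_iova_magazine",
					sizeof(struct iova_magazine),
					0,
					SLAB_HWCACHE_ALIGN | SLAB_NO_MERGE,
					NULL);
```

The same effect can be had system-wide with the `slab_nomerge` boot parameter, but a per-cache flag keeps the observability benefit without disabling merging everywhere.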