Re: slub: [RFC] free slabs without holding locks.

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Thu, 7 Jul 2011, Pekka Enberg wrote:

> > Just saw the slab lockdep problem.

Is the lockdep output available for inclusion in the changelog?

> > We can free from slub without holding
> > any locks. I guess something similar can be done for slab but it would be
> > more complicated given the nesting level of free_block(). Not sure if this
> > brings us anything but it does not look like this is doing anything
> > negative to the performance of the allocator.
> > 
> > 
> > 
> > Subject: slub: free slabs without holding locks.
> > 
> > There are two situations in which slub holds a lock while releasing
> > pages:
> > 
> > 	A. During kmem_cache_shrink()
> > 	B. During kmem_cache_close()
> > 
> > For both situations build a list while holding the lock and then
> > release the pages later. Both functions are not performance critical.
> > 
> > After this patch all invocations of free operations are done without
> > holding any locks.
> > 
> > Signed-off-by: Christoph Lameter <cl@xxxxxxxxx>
> 
> Seems reasonable. David, would you mind taking a look at this?
> 

Sorry for the delay!

> > 
> > ---
> >  mm/slub.c |   49 +++++++++++++++++++++++++------------------------
> >  1 file changed, 25 insertions(+), 24 deletions(-)
> > 
> > Index: linux-2.6/mm/slub.c
> > ===================================================================
> > --- linux-2.6.orig/mm/slub.c	2011-06-20 15:23:38.000000000 -0500
> > +++ linux-2.6/mm/slub.c	2011-06-20 16:11:44.572587454 -0500
> > @@ -2657,18 +2657,22 @@ static void free_partial(struct kmem_cac
> >  {
> >  	unsigned long flags;
> >  	struct page *page, *h;
> > +	LIST_HEAD(empty);
> > 
> >  	spin_lock_irqsave(&n->list_lock, flags);
> >  	list_for_each_entry_safe(page, h, &n->partial, lru) {
> > -		if (!page->inuse) {
> > -			__remove_partial(n, page);
> > -			discard_slab(s, page);
> > -		} else {
> > -			list_slab_objects(s, page,
> > -				"Objects remaining on kmem_cache_close()");
> > -		}
> > +		if (!page->inuse)
> > +			list_move(&page->lru, &empty);
> >  	}
> >  	spin_unlock_irqrestore(&n->list_lock, flags);
> > +
> > +	list_for_each_entry_safe(page, h, &empty, lru)
> > +		discard_slab(s, page);
> > +
> > +	if (!list_empty(&n->partial))
> > +		list_for_each_entry(page, &n->partial, lru)
> > +			list_slab_objects(s, page,
> > +				"Objects remaining on kmem_cache_close()");
> >  }
> > 
> >  /*

The last iteration to check for any pages remaining on the partial list is 
not safe because partial list manipulation is protected by list_lock.  
That needs to be fixed by testing for page->inuse during the iteration 
while still holding the lock and dropping the later iteration all 
together.

> > @@ -2702,6 +2706,9 @@ void kmem_cache_destroy(struct kmem_cach
> >  	s->refcount--;
> >  	if (!s->refcount) {
> >  		list_del(&s->list);
> > +		sysfs_slab_remove(s);
> > +		up_write(&slub_lock);
> > +
> >  		if (kmem_cache_close(s)) {
> >  			printk(KERN_ERR "SLUB %s: %s called for cache that "
> >  				"still has objects.\n", s->name, __func__);
> > @@ -2709,9 +2716,9 @@ void kmem_cache_destroy(struct kmem_cach
> >  		}
> >  		if (s->flags & SLAB_DESTROY_BY_RCU)
> >  			rcu_barrier();
> > -		sysfs_slab_remove(s);
> > -	}
> > -	up_write(&slub_lock);
> > +		kfree(s);

Why the new kfree() here?  If the refcount is 0, then this should be 
handled when the sysfs entry is released regardless of whether 
sysfs_slab_remove() uses the CONFIG_SYSFS variant or not.  If kfree(s) 
were needed here, we'd be leaking s->name as well.

> > +	} else
> > +		up_write(&slub_lock);
> >  }
> >  EXPORT_SYMBOL(kmem_cache_destroy);
> > 

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@xxxxxxxxx.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/
Don't email: <a href=mailto:"dont@xxxxxxxxx";> email@xxxxxxxxx </a>


[Index of Archives]     [Linux ARM Kernel]     [Linux ARM]     [Linux Omap]     [Fedora ARM]     [IETF Annouce]     [Bugtraq]     [Linux]     [Linux OMAP]     [Linux MIPS]     [ECOS]     [Asterisk Internet PBX]     [Linux API]