On Thu, 7 Jul 2011, Pekka Enberg wrote:

> > Just saw the slab lockdep problem.

Is the lockdep output available for inclusion in the changelog?

> > We can free from slub without holding any locks. I guess something
> > similar can be done for slab but it would be more complicated given
> > the nesting level of free_block(). Not sure if this brings us
> > anything but it does not look like this is doing anything negative
> > to the performance of the allocator.
> >
> >
> > Subject: slub: free slabs without holding locks.
> >
> > There are two situations in which slub holds a lock while releasing
> > pages:
> >
> > A. During kmem_cache_shrink()
> > B. During kmem_cache_close()
> >
> > For both situations build a list while holding the lock and then
> > release the pages later. Both functions are not performance critical.
> >
> > After this patch all invocations of free operations are done without
> > holding any locks.
> >
> > Signed-off-by: Christoph Lameter <cl@xxxxxxxxx>
>
> Seems reasonable. David, would you mind taking a look at this?

Sorry for the delay!

> > ---
> >  mm/slub.c |   49 +++++++++++++++++++++++++------------------------
> >  1 file changed, 25 insertions(+), 24 deletions(-)
> >
> > Index: linux-2.6/mm/slub.c
> > ===================================================================
> > --- linux-2.6.orig/mm/slub.c	2011-06-20 15:23:38.000000000 -0500
> > +++ linux-2.6/mm/slub.c	2011-06-20 16:11:44.572587454 -0500
> > @@ -2657,18 +2657,22 @@ static void free_partial(struct kmem_cac
> >  {
> >  	unsigned long flags;
> >  	struct page *page, *h;
> > +	LIST_HEAD(empty);
> >
> >  	spin_lock_irqsave(&n->list_lock, flags);
> >  	list_for_each_entry_safe(page, h, &n->partial, lru) {
> > -		if (!page->inuse) {
> > -			__remove_partial(n, page);
> > -			discard_slab(s, page);
> > -		} else {
> > -			list_slab_objects(s, page,
> > -				"Objects remaining on kmem_cache_close()");
> > -		}
> > +		if (!page->inuse)
> > +			list_move(&page->lru, &empty);
> >  	}
> >  	spin_unlock_irqrestore(&n->list_lock, flags);
> > +
> > +	list_for_each_entry_safe(page, h, &empty, lru)
> > +		discard_slab(s, page);
> > +
> > +	if (!list_empty(&n->partial))
> > +		list_for_each_entry(page, &n->partial, lru)
> > +			list_slab_objects(s, page,
> > +				"Objects remaining on kmem_cache_close()");
> >  }
> >
> >  /*

The last iteration to check for any pages remaining on the partial list
is not safe because partial list manipulation is protected by
list_lock.  That needs to be fixed by testing for page->inuse during
the iteration while still holding the lock and dropping the later
iteration altogether.
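
Something along these lines (completely untested, just to illustrate
what I mean; it reuses the variables from the hunk above) keeps every
walk of n->partial under list_lock and still frees the empty slabs
afterwards without holding it:

	spin_lock_irqsave(&n->list_lock, flags);
	list_for_each_entry_safe(page, h, &n->partial, lru) {
		if (!page->inuse)
			list_move(&page->lru, &empty);
		else
			list_slab_objects(s, page,
				"Objects remaining on kmem_cache_close()");
	}
	spin_unlock_irqrestore(&n->list_lock, flags);

	/*
	 * Slabs on the local list are no longer reachable through
	 * n->partial, so they can be freed without list_lock.
	 */
	list_for_each_entry_safe(page, h, &empty, lru)
		discard_slab(s, page);
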
> > @@ -2702,6 +2706,9 @@ void kmem_cache_destroy(struct kmem_cach
> >  	s->refcount--;
> >  	if (!s->refcount) {
> >  		list_del(&s->list);
> > +		sysfs_slab_remove(s);
> > +		up_write(&slub_lock);
> > +
> >  		if (kmem_cache_close(s)) {
> >  			printk(KERN_ERR "SLUB %s: %s called for cache that "
> >  				"still has objects.\n", s->name, __func__);
> > @@ -2709,9 +2716,9 @@ void kmem_cache_destroy(struct kmem_cach
> >  		}
> >  		if (s->flags & SLAB_DESTROY_BY_RCU)
> >  			rcu_barrier();
> > -		sysfs_slab_remove(s);
> > -	}
> > -	up_write(&slub_lock);
> > +		kfree(s);

Why the new kfree() here?  If the refcount is 0, then this should be
handled when the sysfs entry is released regardless of whether
sysfs_slab_remove() uses the CONFIG_SYSFS variant or not.  If kfree(s)
were needed here, we'd be leaking s->name as well.

> > +	} else
> > +		up_write(&slub_lock);
> >  }
> >  EXPORT_SYMBOL(kmem_cache_destroy);
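
To spell out the release path I mean: with CONFIG_SYSFS,
sysfs_slab_remove() drops the last kobject reference and the release
hook already frees the cache, roughly like this (quoting mm/slub.c from
memory, so the exact code may differ):

static void kmem_cache_release(struct kobject *kobj)
{
	struct kmem_cache *s = to_slab(kobj);

	kfree(s->name);	/* cache names are allocated separately, hence the s->name concern above */
	kfree(s);
}

So an extra kfree(s) in kmem_cache_destroy() either duplicates what the
release path already does or, if it is meant to stand in for it, leaves
s->name behind.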