On 6/19/24 11:51 AM, Uladzislau Rezki wrote: > On Tue, Jun 18, 2024 at 09:48:49AM -0700, Paul E. McKenney wrote: >> On Tue, Jun 18, 2024 at 11:31:00AM +0200, Uladzislau Rezki wrote: >> > > On 6/17/24 8:42 PM, Uladzislau Rezki wrote: >> > > >> + >> > > >> + s = container_of(work, struct kmem_cache, async_destroy_work); >> > > >> + >> > > >> + // XXX use the real kmem_cache_free_barrier() or similar thing here >> > > > It implies that we need to introduce kfree_rcu_barrier(), a new API, which i >> > > > wanted to avoid initially. >> > > >> > > I wanted to avoid new API or flags for kfree_rcu() users and this would >> > > be achieved. The barrier is used internally so I don't consider that an >> > > API to avoid. How difficult is the implementation is another question, >> > > depending on how the current batching works. Once (if) we have sheaves >> > > proven to work and move kfree_rcu() fully into SLUB, the barrier might >> > > also look different and hopefully easier. So maybe it's not worth to >> > > invest too much into that barrier and just go for the potentially >> > > longer, but easier to implement? >> > > >> > Right. I agree here. If the cache is not empty, OK, we just defer the >> > work, even we can use a big 21 seconds delay, after that we just "warn" >> > if it is still not empty and leave it as it is, i.e. emit a warning and >> > we are done. >> > >> > Destroying the cache is not something that must happen right away. >> >> OK, I have to ask... >> >> Suppose that the cache is created and destroyed by a module and >> init/cleanup time, respectively. Suppose that this module is rmmod'ed >> then very quickly insmod'ed. >> >> Do we need to fail the insmod if the kmem_cache has not yet been fully >> cleaned up? If not, do we have two versions of the same kmem_cache in >> /proc during the overlap time? >> > No fail :) If same cache is created several times, its s->refcount gets > increased, so, it does not create two entries in the "slabinfo". But i > agree that your point is good! We need to be carefully with removing and > simultaneous creating. Note that this merging may be disabled or not happen due to various flags on the cache being incompatible with it. And I want to actually make sure it never happens for caches being already destroyed as that would lead to use-after-free (the workfn doesn't recheck the refcount in case a merge would happen during the grace period) --- a/mm/slab_common.c +++ b/mm/slab_common.c @@ -150,9 +150,10 @@ int slab_unmergeable(struct kmem_cache *s) #endif /* - * We may have set a slab to be unmergeable during bootstrap. + * We may have set a cache to be unmergeable during bootstrap. + * 0 is for cache being destroyed asynchronously */ - if (s->refcount < 0) + if (s->refcount <= 0) return 1; return 0; > From the first glance, there is a refcounter and a global "slab_mutex" > which is used to protect a critical section. Destroying is almost fully > protected(as noted above, by a global mutex) with one exception, it is: > > static void kmem_cache_release(struct kmem_cache *s) > { > if (slab_state >= FULL) { > sysfs_slab_unlink(s); > sysfs_slab_release(s); > } else { > slab_kmem_cache_release(s); > } > } > > this one can race, IMO. > > -- > Uladzislau Rezki