On Thu, Aug 25, 2022 at 05:33:49PM +0200, Vlastimil Babka wrote:
> On 8/25/22 17:00, Joel Fernandes wrote:
> > On Thu, Aug 25, 2022 at 10:14:16AM +0200, Vlastimil Babka wrote:
> >>
> >> On 8/25/22 09:50, Vlastimil Babka wrote:
> >> >
> >> > On 8/25/22 09:13, Vlastimil Babka wrote:
> >> >> On 8/25/22 04:56, Joel Fernandes wrote:
> >> >>> Hi,
> >> >>> I am trying to add some debug information to rcu_head for some patches,
> >> >>> and need your help! At least this used to work some time ago (> 1 year).
> >> >>
> >> >> Hmm, got a patch of how it worked back then? Because what I see in the
> >> >> v5.12 struct page definition:
> >> >>
> >> >> 	struct rcu_head rcu_head;
> >> >>
> >> >> is in a union with the whole of
> >> >>
> >> >> 	struct {	/* slab, slob and slub */ ... }
> >> >>
> >> >> which starts with
> >> >>
> >> >> 	union { struct list_head slab_list; ... }
> >> >> 	struct kmem_cache *slab_cache; /* not slob */
> >> >>
> >> >> and there's e.g. kmem_rcu_free() in mm/slab.c that does
> >> >>
> >> >> 	page = container_of(head, struct page, rcu_head);
> >> >> 	cachep = page->slab_cache;
> >> >>
> >> >> and a very similar thing in mm/slub.c rcu_free_slab() - we assume that the
> >> >> rcu_head and slab_cache fields can coexist.
> >> >>
> >> >> So this looks silently fragile to me? In case rcu_head becomes larger than
> >> >> list_head, e.g. with your buf[4], it would start overlapping the
> >> >> page->slab_cache field, no?
> >> >>
> >> >> AFAICS the way it's structured now in struct slab in mm/slab.h makes the
> >> >> overlap impossible; thus struct slab will grow as rcu_head grows, and the
> >> >> offsets of the later fields stop matching their struct page counterparts,
> >> >> which don't grow. Hmm.
> >> >
> >> > The following has made things compile for both SLUB and SLAB (didn't
> >> > adjust SLOB). The idea is to define more explicitly that everything can
> >> > overlap with rcu_head except slab_cache. So the structures don't even
> >> > grow in the end.
> >> > Which requires moving the slab_cache field around. Which would be fine
> >> > except for SLUB and its cmpxchg_double usage, which requires the
> >> > freelist+counter fields to be 128bit aligned, which means we can't fit it
> >> > all with these constraints. So that's broken with the patch and I don't
> >> > see an easy way to fix this.
> >>
> >> Ah, maybe I found a way. It requires that rcu_head in struct slab be at a
> >> different offset than in struct page, and that we drop the particular
> >> SLAB_MATCH check. I think it should be fine as we don't cast struct slab
> >> to struct page in the rcu calls/callbacks, so it was probably there just
> >> for intermediate steps of the struct slab series, and can be dropped now.
> >>
> >> So this should allow you to use up to 32 bytes of rcu_head.
> >
> > Thanks a lot, Vlastimil! I can confirm it builds, some comments below:
> >
> >> ----8<----
> >> diff --git a/include/linux/types.h b/include/linux/types.h
> >> index ea8cf60a8a79..daf7682a7a3b 100644
> >> --- a/include/linux/types.h
> >> +++ b/include/linux/types.h
> >> @@ -220,6 +220,7 @@ struct ustat {
> >>  struct callback_head {
> >>  	struct callback_head *next;
> >>  	void (*func)(struct callback_head *head);
> >> +	char buf[4];
> >>  } __attribute__((aligned(sizeof(void *))));
> >>  #define rcu_head callback_head
> >>
> >> diff --git a/mm/slab.h b/mm/slab.h
> >> index 4ec82bec15ec..dd49851f9814 100644
> >> --- a/mm/slab.h
> >> +++ b/mm/slab.h
> >> @@ -12,37 +12,45 @@ struct slab {
> >>  #if defined(CONFIG_SLAB)
> >>
> >>  	union {
> >> -		struct list_head slab_list;
> >> +		struct {
> >> +			struct list_head slab_list;
> >> +			void *freelist;	/* array of free object indexes */
> >> +			void *s_mem;	/* first object */
> >> +		};
> >>  		struct rcu_head rcu_head;
> >
> > Does this present any problems for SLAB_TYPESAFE_BY_RCU, such as the slab
> > being in use before the end of a grace period?
> > I was concerned about the
> > freelist being corrupted by the rcu_head storage, since it's now in a
> > union with it. If this does not matter, then that's all good.
>
> Good point, I didn't check that. If we're freeing the slab then nothing
> should be using it at that point, including the freelist pointer. Only the
> actual object storage is guaranteed not to go away or become overwritten
> for SLAB_TYPESAFE_BY_RCU caches.

Cool, sounds good.

> But we need to check the slab freeing functions themselves.
> It looks like SLAB is fine: slab_destroy() there takes care to copy the
> freelist pointer away before call_rcu() already, and then it only works
> with the copy. Also nothing should need s_mem either, in case you need
> more bytes for the rcu_head debug.
>
> SLUB seems also fine, as the freeing doesn't try to look at the freelist
> even with debugging consistency checks enabled. It could be a problem in
> case of a double-free, but that's already a problem anyway.
> In case you need more than the 8 bytes of the freelist pointer, then
> overwriting the next 'counters' union would be a problem for the
> consistency checks, as they read slab->objects from there.
> But I guess we could do those checks before call_rcu, so the callback
> would then only free the slab page. The state expected by those checks
> doesn't depend on the grace period expiring.

Ok, sounds good. I can run it for a few days on real hardware and see if
something breaks or such.

The most I can get with your patch is 12 bytes before the FOLIO checks
start complaining. I can stick to 8 bytes, and perhaps we can allocate a
separate debug object if we need more or something.
> >> @@ -66,9 +74,13 @@ struct slab {
> >>  #define SLAB_MATCH(pg, sl)						\
> >>  	static_assert(offsetof(struct page, pg) == offsetof(struct slab, sl))
> >>  SLAB_MATCH(flags, __page_flags);
> >> +#ifdef CONFIG_SLUB
> >> +SLAB_MATCH(compound_head, slab_cache);	/* Ensure bit 0 is clear */
> >> +#else
> >>  SLAB_MATCH(compound_head, slab_list);	/* Ensure bit 0 is clear */
> >> +#endif
> >>  #ifndef CONFIG_SLOB
> >> -SLAB_MATCH(rcu_head, rcu_head);
> >> +// SLAB_MATCH(rcu_head, rcu_head);	/* not necessary, hopefully? */
> >
> > I think if it helps, I can make this check depend on a default-off debug
> > config option.
>
> It wouldn't help, as with this patch, for SLUB, the rcu_head would start
> at a different offset regardless of whether you enable the option to
> increase its size.

True, unless we maintained two versions of the structs based on CONFIGs,
but that's really ugly.

Thanks so much for this patch, it's surely helpful for debugging purposes
and I look forward to testing it more and providing you feedback!

thanks,

 - Joel