On Sun, Jan 21, 2024 at 06:39:26PM -0500, Pasha Tatashin wrote:
> On Wed, May 10, 2023 at 12:28 PM Kent Overstreet
> <kent.overstreet@xxxxxxxxx> wrote:
> > Hasn't been addressed yet, but we were just talking about moving the
> > codetag pointer from page_ext to page last night for memory overhead
> > reasons.
> >
> > The disadvantage then is that the memory overhead doesn't go down if you
> > disable memory allocation profiling at boot time...
> >
> > But perhaps the performance overhead is low enough now that this is not
> > something we expect to be doing as much?
> >
> > Choices, choices...
>
> I would like to participate in this discussion, specifically to

Umm, this is a discussion proposal for last year, not this.  I don't
remember if a followup discussion has been proposed for this year?

> 2. Reducing the memory overhead by not using page_ext pointer, but
> instead use n-bits in the page->flags.
>
> The number of buckets is actually not that large, there is no need to
> keep 8-byte pointer in page_ext, it could be an idx in an array of a
> specific size. There could be buckets that contain several stacks.

There are a lot of people using "n bits in page->flags" and I don't have
a good feeling for how many we really have left.  MGLRU uses a variable
number of bits.  There's PG_arch_2 and PG_arch_3.  There's PG_uncached.
There's PG_young and PG_idle.  And of course we have NUMA node (10
bits?), section (?), zone (3 bits?).  I count 28 bits allocated with all
the CONFIG enabled, then 13 for node+zone, so it certainly seems like
there's a lot free on 64-bit, but it'd be nice to have it written out
properly.

Related, what do we think is going to happen with page_ext in a memdesc
world (also what's going to happen with the kmsan goop in struct page?)

I see page_idle_ops, page_owner_ops and page_table_check_ops.
page_idle_ops only uses the 8 byte flags.  page_owner_ops uses an extra
64 bytes (!).  page_table_check uses an extra 8 bytes.

page_idle looks to be for folios only.
page_table_check seems like it should be folded into pgdesc.
page_owner maybe gets added to every allocation rather than every page
(but that's going to be interesting for memdescs which don't normally
need an allocation).

That seems to imply that we can get rid of page_ext entirely, which will
be nice.

I don't understand kmsan well enough to understand what to do about it.
If it's per-allocation, we can handle it like page_owner.  If it really
is per-page, we can make it an ifdef in struct page itself.  I think
it's OK to grow struct page for such a rarely used debugging option.
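
For what it's worth, here is a rough userspace sketch of the
page->flags budget, using only the guesses from this mail (28 flag
bits, 10 node bits, 3 zone bits, section bits left out because I don't
know the count).  The 16-bit index width, its placement just above the
flag bits, and the helper names are all hypothetical; none of this is
taken from kernel headers.

```c
#include <assert.h>
#include <stdint.h>

/* All figures are the estimates from the mail, not values read from
 * kernel headers. */
enum {
	WORD_BITS     = 64,	/* unsigned long on a 64-bit build */
	PG_FLAG_BITS  = 28,	/* "28 bits allocated with all the CONFIG enabled" */
	NODE_BITS     = 10,	/* guess */
	ZONE_BITS     = 3,	/* guess */
	/* Section bits are deliberately omitted: count unknown. */
	FREE_BITS     = WORD_BITS - PG_FLAG_BITS - NODE_BITS - ZONE_BITS,

	TAG_IDX_BITS  = 16,		/* hypothetical codetag index width */
	TAG_IDX_SHIFT = PG_FLAG_BITS,	/* hypothetical: just above the flags */
};

_Static_assert(TAG_IDX_BITS <= FREE_BITS,
	       "codetag index does not fit in the assumed free bits");

#define TAG_IDX_MASK \
	((((uint64_t)1 << TAG_IDX_BITS) - 1) << TAG_IDX_SHIFT)

/* Store an index into a (hypothetical) codetag array in the flags word,
 * leaving every other bit untouched. */
static uint64_t flags_set_tag_idx(uint64_t flags, unsigned int idx)
{
	assert(idx < (1u << TAG_IDX_BITS));
	return (flags & ~TAG_IDX_MASK) | ((uint64_t)idx << TAG_IDX_SHIFT);
}

/* Read the index back out of the flags word. */
static unsigned int flags_get_tag_idx(uint64_t flags)
{
	return (unsigned int)((flags & TAG_IDX_MASK) >> TAG_IDX_SHIFT);
}
```

With those assumed numbers FREE_BITS comes out at 23, so a 16-bit index
would fit.  The real answer depends on the section bits and on how many
bits MGLRU takes in a given config, which is why it would be nice to
have it written out properly.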