On Sun, Jan 21, 2024 at 6:54 PM Matthew Wilcox <willy@xxxxxxxxxxxxx> wrote:
>
> On Sun, Jan 21, 2024 at 06:31:48PM -0500, Pasha Tatashin wrote:
> > On Sun, Jan 21, 2024 at 6:14 PM Matthew Wilcox <willy@xxxxxxxxxxxxx> wrote:
> > > I can add a proposal for a topic on both the PCP and Buddy allocators
> > > (I have a series of Thoughts on how the PCP allocator works in a memdesc
> > > world that I haven't written down & sent out yet).
> >
> > Interesting, given that pcp are mostly allocated by kmalloc and use
> > vmalloc for large allocations, how memdesc can be different for them
> > compared to regular kmalloc allocations given that they are sub-page?
>
> Oh! I don't mean the mm/percpu.c allocator. I mean the pcp allocator
> in mm/page_alloc.c.

Nevermind, this makes perfect sense now :-)

> I don't have any Thoughts on mm/percpu.c at this time. I'm vaguely
> aware that it exists ;-)
>
> > > There's so much work to be done! And it's mostly parallelisable and almost
> > > trivial. It's just largely on the filesystem-page cache interaction, so
> > > it's not terribly interesting. See, for example, the ext2, ext4, gfs2,
> > > nilfs2, ufs and ubifs patchsets I've done over the past few releases.
> > > I have about half of an ntfs3 patchset ready to send.
> > >
> > > There's a bunch of work to be done in DRM to switch from pages to folios
> > > due to their use of shmem. You can also grep for 'page->mapping' (because
> > > fortunately we aren't too imaginative when it comes to naming variables)
> > > and find 270 places that need to be changed. Some are comments, but
> > > those still need to be updated!
> > >
> > > Anything using lock_page(), get_page(), set_page_dirty(), using
> > > &folio->page, any of the functions in mm/folio-compat.c needs auditing.
> > > We can make the first three of those work, but they're good indicators
> > > that the code needs to be looked at.
> > >
> > > There is some interesting work to be done, and one of the things I'm
> > > thinking hard about right now is how we're doing folio conversions
> > > that make sense with today's code, and stop making sense when we get
> > > to memdescs. That doesn't apply to anything interacting with the page
> > > cache (because those are folios now and in the future), but it does apply
> > > to one spot in ext4 where it allocates memory from slab and attaches a
> > > buffer_head to it ...
> >
> > There are many more drivers that would need the conversion. For
> > example, IOMMU page tables can occupy gigabytes of space, have
> > different implementations for AMD, X86, and several ARMs. Conversion
> > to memdesc and unifying the IO page table management implementation
> > for these platforms would be beneficial.
>
> Understood; there's a lot of code that can benefit from larger
> allocations. I was listing the impediments to shrinking struct page
> rather than the places which would most benefit from switching to larger
> allocations. They're complementary to a large extent; you can switch
> to compound allocations today and get the benefit later. And unifying
> implementations is always a worthy project.