On Fri, Dec 27, 2024 at 09:28:20AM +0200, Mike Rapoport wrote:
> From: "Kirill A. Shutemov" <kirill.shutemov@xxxxxxxxxxxxxxx>
>
> Change of attributes of the pages may lead to fragmentation of direct
> mapping over time and performance degradation as result.
>
> With current code it's one way road: kernel tries to avoid splitting
> large pages, but it doesn't restore them back even if page attributes
> got compatible again.
>
> Any change to the mapping may potentially allow to restore large page.
>
> Hook up into cpa_flush() path to check if there's any pages to be
> recovered in PUD_SIZE range around pages we've just touched.
>
> CPUs don't like[1] to have to have TLB entries of different size for the
> same memory, but looks like it's okay as long as these entries have
> matching attributes[2]. Therefore it's critical to flush TLB before any
> following changes to the mapping.
>
> Note that we already allow for multiple TLB entries of different sizes
> for the same memory now in split_large_page() path. It's not a new
> situation.
>
> set_memory_4k() provides a way to use 4k pages on purpose. Kernel must
> not remap such pages as large. Re-use one of software PTE bits to
> indicate such pages.
>
> [1] See Erratum 383 of AMD Family 10h Processors
> [2] https://lore.kernel.org/linux-mm/1da1b025-cabc-6f04-bde5-e50830d1ecf0@xxxxxxx/
>
> [rppt@xxxxxxxxxx:
>  * s/restore/collapse/
>  * update formatting per peterz
>  * use 'struct ptdesc' instead of 'struct page' for list of page tables to
>    be freed
>  * try to collapse PMD first and if it succeeds move on to PUD as peterz
>    suggested
>  * flush TLB twice: for changes done in the original CPA call and after
>    collapsing of large pages
> ]
>
> Link: https://lore.kernel.org/all/20200416213229.19174-1-kirill.shutemov@xxxxxxxxxxxxxxx
> Signed-off-by: Kirill A. Shutemov <kirill.shutemov@xxxxxxxxxxxxxxx>
> Co-developed-by: Mike Rapoport (Microsoft) <rppt@xxxxxxxxxx>
> Signed-off-by: Mike Rapoport (Microsoft) <rppt@xxxxxxxxxx>

When I originally attempted this, the patch was dropped because of
performance regressions. Was it addressed somehow?

-- 
  Kiryl Shutsemau / Kirill A. Shutemov