On Sun, Jan 12, 2025 at 10:54:46AM +0200, Mike Rapoport wrote:
> Hi Kirill,
>
> On Fri, Jan 10, 2025 at 12:36:59PM +0200, Kirill A. Shutemov wrote:
> > On Fri, Dec 27, 2024 at 09:28:20AM +0200, Mike Rapoport wrote:
> > > From: "Kirill A. Shutemov" <kirill.shutemov@xxxxxxxxxxxxxxx>
> > >
> > > Change of attributes of the pages may lead to fragmentation of direct
> > > mapping over time and performance degradation as result.
> > >
> > > With current code it's one way road: kernel tries to avoid splitting
> > > large pages, but it doesn't restore them back even if page attributes
> > > got compatible again.
> > >
> > > Any change to the mapping may potentially allow to restore large page.
> > >
> > > Hook up into cpa_flush() path to check if there's any pages to be
> > > recovered in PUD_SIZE range around pages we've just touched.
> > >
> > > CPUs don't like[1] to have to have TLB entries of different size for the
> > > same memory, but looks like it's okay as long as these entries have
> > > matching attributes[2]. Therefore it's critical to flush TLB before any
> > > following changes to the mapping.
> > >
> > > Note that we already allow for multiple TLB entries of different sizes
> > > for the same memory now in split_large_page() path. It's not a new
> > > situation.
> > >
> > > set_memory_4k() provides a way to use 4k pages on purpose. Kernel must
> > > not remap such pages as large. Re-use one of software PTE bits to
> > > indicate such pages.
> > >
> > > [1] See Erratum 383 of AMD Family 10h Processors
> > > [2] https://lore.kernel.org/linux-mm/1da1b025-cabc-6f04-bde5-e50830d1ecf0@xxxxxxx/
> > >
> > > [rppt@xxxxxxxxxx:
> > > * s/restore/collapse/
> > > * update formatting per peterz
> > > * use 'struct ptdesc' instead of 'struct page' for list of page tables to
> > >   be freed
> > > * try to collapse PMD first and if it succeeds move on to PUD as peterz
> > >   suggested
> > > * flush TLB twice: for changes done in the original CPA call and after
> > >   collapsing of large pages
> > > ]
> > >
> > > Link: https://lore.kernel.org/all/20200416213229.19174-1-kirill.shutemov@xxxxxxxxxxxxxxx
> > > Signed-off-by: Kirill A. Shutemov <kirill.shutemov@xxxxxxxxxxxxxxx>
> > > Co-developed-by: Mike Rapoport (Microsoft) <rppt@xxxxxxxxxx>
> > > Signed-off-by: Mike Rapoport (Microsoft) <rppt@xxxxxxxxxx>
> >
> > When I originally attempted this, the patch was dropped because of
> > performance regressions. Was it addressed somehow?
>
> I didn't realize the patch was dropped because of performance regressions,
> so I didn't address it.
>
> Do you remember where did the regressions show up?

https://github.com/zen-kernel/zen-kernel/issues/169

My understanding is if userspace somewhat frequently triggers set_memory_*
codepath we will get a performance hit.

-- 
  Kiryl Shutsemau / Kirill A. Shutemov