Re: [PATCH 3/8] x86/mm/pat: Restore large pages after fragmentation

On Sun, Jan 12, 2025 at 10:54:46AM +0200, Mike Rapoport wrote:
> Hi Kirill,
> 
> On Fri, Jan 10, 2025 at 12:36:59PM +0200, Kirill A. Shutemov wrote:
> > On Fri, Dec 27, 2024 at 09:28:20AM +0200, Mike Rapoport wrote:
> > > From: "Kirill A. Shutemov" <kirill.shutemov@xxxxxxxxxxxxxxx>
> > > 
> > > Changes to page attributes may lead to fragmentation of the direct
> > > mapping over time and, as a result, to performance degradation.
> > > 
> > > With the current code it's a one-way road: the kernel tries to avoid
> > > splitting large pages, but it never restores them even when the page
> > > attributes become compatible again.
> > > 
> > > Any change to the mapping may potentially allow restoring a large page.
> > > 
> > > Hook into the cpa_flush() path to check whether there are any pages to
> > > be recovered in the PUD_SIZE range around the pages we've just touched.
> > > 
> > > CPUs don't like[1] having TLB entries of different sizes for the same
> > > memory, but it looks like that's okay as long as these entries have
> > > matching attributes[2]. It is therefore critical to flush the TLB before
> > > any subsequent changes to the mapping.
> > > 
> > > Note that we already allow multiple TLB entries of different sizes for
> > > the same memory in the split_large_page() path, so this is not a new
> > > situation.
> > > 
> > > set_memory_4k() provides a way to use 4k pages on purpose. The kernel
> > > must not remap such pages as large. Re-use one of the software PTE bits
> > > to mark such pages.
> > > 
> > > [1] See Erratum 383 of AMD Family 10h Processors
> > > [2] https://lore.kernel.org/linux-mm/1da1b025-cabc-6f04-bde5-e50830d1ecf0@xxxxxxx/
> > > 
> > > [rppt@xxxxxxxxxx:
> > >  * s/restore/collapse/
> > >  * update formatting per peterz
> > >  * use 'struct ptdesc' instead of 'struct page' for list of page tables to
> > >    be freed
> > >  * try to collapse PMD first and if it succeeds move on to PUD as peterz
> > >    suggested
> > >  * flush TLB twice: for changes done in the original CPA call and after
> > >    collapsing of large pages
> > > ]
> > > 
> > > Link: https://lore.kernel.org/all/20200416213229.19174-1-kirill.shutemov@xxxxxxxxxxxxxxx
> > > Signed-off-by: Kirill A. Shutemov <kirill.shutemov@xxxxxxxxxxxxxxx>
> > > Co-developed-by: Mike Rapoport (Microsoft) <rppt@xxxxxxxxxx>
> > > Signed-off-by: Mike Rapoport (Microsoft) <rppt@xxxxxxxxxx>
> > 
> > When I originally attempted this, the patch was dropped because of
> > performance regressions. Was it addressed somehow?
> 
> I didn't realize the patch was dropped because of performance regressions,
> so I didn't address it.
> 
> Do you remember where the regressions showed up?

https://github.com/zen-kernel/zen-kernel/issues/169

My understanding is that if userspace triggers a set_memory_* codepath
somewhat frequently, we will get a performance hit.

-- 
  Kiryl Shutsemau / Kirill A. Shutemov



