On Wed, Nov 14, 2012 at 03:40:37PM -0800, David Rientjes wrote:
> On Wed, 7 Nov 2012, Kirill A. Shutemov wrote:
>
> > From: "Kirill A. Shutemov" <kirill.shutemov@xxxxxxxxxxxxxxx>
> >
> > H. Peter Anvin doesn't like huge zero page which sticks in memory forever
> > after the first allocation. Here's implementation of lockless refcounting
> > for huge zero page.
> >
> > We have two basic primitives: {get,put}_huge_zero_page(). They
> > manipulate reference counter.
> >
> > If counter is 0, get_huge_zero_page() allocates a new huge page and
> > takes two references: one for caller and one for shrinker. We free the
> > page only in shrinker callback if counter is 1 (only shrinker has the
> > reference).
> >
> > put_huge_zero_page() only decrements counter. Counter is never zero
> > in put_huge_zero_page() since shrinker holds on reference.
> >
> > Freeing huge zero page in shrinker callback helps to avoid frequent
> > allocate-free.
> >
> > Refcounting has cost. On 4 socket machine I observe ~1% slowdown on
> > parallel (40 processes) read page faulting comparing to lazy huge page
> > allocation. I think it's pretty reasonable for synthetic benchmark.
> >
>
> Eek, this is disappointing that we need to check a refcount before
> referencing the zero huge page

No, we don't. It's a parallel *read* page fault benchmark, meaning we
map/unmap the huge zero page all the time. So it's a purely synthetic
test to show the refcounting overhead. If we see only 1% overhead on the
synthetic test, we will not see it in real-world workloads.

> and it obviously shows in your benchmark
> (which I consider 1% to be significant given the alternative is 2MB of
> memory for a system where thp was enabled to be on). I think it would be
> much better to simply allocate and reference the zero huge page locklessly
> when thp is enabled to be either "madvise" or "always", i.e. allocate it
> when enabled.

-- 
 Kirill A. Shutemov
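
For readers following the thread, the refcounting scheme described in the
quoted changelog can be sketched in user space roughly as follows. This is
only an illustration built on C11 atomics, not the code from the patch:
zero_buf_get(), zero_buf_put() and zero_buf_shrink() are hypothetical
stand-ins for {get,put}_huge_zero_page() and the shrinker callback, calloc()
stands in for the huge page allocation, and kernel-specific concerns are
ignored.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

#define ZERO_BUF_SIZE (2UL * 1024 * 1024)  /* 2MB, the size quoted in the thread */

/* Hypothetical globals: 0 in the counter means "nothing allocated". */
static atomic_int zero_refcount;
static _Atomic(void *) zero_buf;

/* Increment *v unless it is 0; returns true if a reference was taken. */
static bool inc_not_zero(atomic_int *v)
{
    int old = atomic_load(v);

    while (old != 0) {
        /* On failure, 'old' is reloaded with the current value. */
        if (atomic_compare_exchange_weak(v, &old, old + 1))
            return true;
    }
    return false;
}

/* Take a reference, allocating the shared buffer on first use. */
void *zero_buf_get(void)
{
    for (;;) {
        if (inc_not_zero(&zero_refcount))
            return atomic_load(&zero_buf);

        void *fresh = calloc(1, ZERO_BUF_SIZE);
        if (!fresh)
            return NULL;

        void *expected = NULL;
        if (!atomic_compare_exchange_strong(&zero_buf, &expected, fresh)) {
            free(fresh);  /* someone else installed a buffer; retry */
            continue;
        }

        /* Two references: one for the caller, one for the shrinker. */
        atomic_store(&zero_refcount, 2);
        return fresh;
    }
}

/* Drop a caller reference; the shrinker's reference keeps the count >= 1. */
void zero_buf_put(void)
{
    int old = atomic_fetch_sub(&zero_refcount, 1);

    assert(old > 1);
}

/* Free the buffer only if the shrinker holds the last reference. */
bool zero_buf_shrink(void)
{
    int expected = 1;

    if (!atomic_compare_exchange_strong(&zero_refcount, &expected, 0))
        return false;

    free(atomic_exchange(&zero_buf, (void *)NULL));
    return true;
}
```

In this sketch a caller pairs each zero_buf_get() with one zero_buf_put(),
and zero_buf_shrink() is invoked separately under memory pressure. The fast
path of zero_buf_get() is a compare-and-swap loop on one shared counter,
which is the kind of per-map/unmap cost a refcounting scheme adds.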