Re: [PATCH v5 5/5] mm/page_alloc: reuse tail struct pages for compound devmaps

On Fri, Feb 11, 2022 at 8:48 PM Joao Martins <joao.m.martins@xxxxxxxxxx> wrote:
>
> On 2/11/22 05:07, Muchun Song wrote:
> > On Fri, Feb 11, 2022 at 3:34 AM Joao Martins <joao.m.martins@xxxxxxxxxx> wrote:
> >> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> >> index cface1d38093..c10df2fd0ec2 100644
> >> --- a/mm/page_alloc.c
> >> +++ b/mm/page_alloc.c
> >> @@ -6666,6 +6666,20 @@ static void __ref __init_zone_device_page(struct page *page, unsigned long pfn,
> >>         }
> >>  }
> >>
> >> +/*
> >> + * With compound page geometry and when struct pages are stored in RAM,
> >> + * most tail pages are reused. Consequently, the amount of unique struct
> >> + * pages to initialize is a lot smaller than the total amount of struct
> >> + * pages being mapped. This is a paired / mild layering violation with
> >> + * explicit knowledge of how the sparse_vmemmap internals handle compound
> >> + * pages in the absence of an altmap. See vmemmap_populate_compound_pages().
> >> + */
> >> +static inline unsigned long compound_nr_pages(struct vmem_altmap *altmap,
> >> +                                             unsigned long nr_pages)
> >> +{
> >> +       return !altmap ? 2 * (PAGE_SIZE/sizeof(struct page)) : nr_pages;
> >> +}
> >> +
> >
> > This means only the first 2 pages will be modified; the remaining 6 or 4094
> > pages will not.  In the HugeTLB case, those tail pages are mapped read-only
> > to catch invalid usage of tail pages (e.g. write operations). Quick question:
> > should we also do something similar for DAX?
> >
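
To spell out where those numbers come from, here is a small userspace
sketch that just redoes the arithmetic behind compound_nr_pages(); it
assumes 4K base pages and a 64-byte struct page (a typical x86-64
config), both of which are of course config dependent:

#include <stdio.h>

/* Illustration only: assumes PAGE_SIZE == 4K and sizeof(struct page) == 64. */
#define EX_PAGE_SIZE		4096UL
#define EX_STRUCT_PAGE_SIZE	64UL

int main(void)
{
	unsigned long devmap_sizes[] = { 2UL << 20, 1UL << 30 };	/* 2M and 1G */

	for (int i = 0; i < 2; i++) {
		unsigned long nr_pages = devmap_sizes[i] / EX_PAGE_SIZE;
		unsigned long vmemmap_pages = nr_pages * EX_STRUCT_PAGE_SIZE / EX_PAGE_SIZE;
		/* Same formula as compound_nr_pages() with altmap == NULL:
		 * only the struct pages backed by the first 2 vmemmap pages
		 * are unique; the remaining vmemmap pages are reused. */
		unsigned long unique = 2 * (EX_PAGE_SIZE / EX_STRUCT_PAGE_SIZE);

		printf("%lu struct pages, %lu vmemmap pages: %lu unique struct pages, %lu reused tail vmemmap pages\n",
		       nr_pages, vmemmap_pages, unique, vmemmap_pages - 2);
	}
	return 0;
}

That is 2 unique + 6 reused vmemmap pages for a 2M compound devmap, and
2 + 4094 for 1G, matching the figures above.
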
> What stands in the way of marking the deduplicated pages read-only is one
> particular CONFIG_DEBUG_VM feature, namely page_init_poison(). HugeTLB
> gets its memory from the page allocator, i.e. from system RAM sections that
> were already populated at boot, and those need to be 'given back' before
> they can be hotunplugged. So I guess it never goes through
> page_init_poison(). With device-dax, on the other hand, the sections are
> populated and dedicated to device-dax when hotplugged, and then depopulated
> on hotunplug when the last devdax user drops the page reference.
>
> So page_init_poison() is called on both of those occasions, and it actually
> writes to whole sections of the memmap, not just one page. So either I gate
> the read-only protection on CONFIG_DEBUG_VM=n (which feels very wrong), or I
> detect inside page_init_poison() that the caller is trying to init compound
> devmap backed struct pages that were already watermarked (i.e. essentially
> when the pfn offset between the passed page and the head page is bigger
> than 128).

Got it. I hadn't realized that page_init_poison() writes to those struct pages.
I agree with you that mapping them read-only would be wrong here.
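
FWIW, if you do end up detecting this inside page_init_poison(), I picture
the check looking roughly like the sketch below. It is only meant to
illustrate the "pfn offset from the head bigger than 128" idea:
page_is_reused_compound_tail() does not exist, the pgmap_vmemmap_nr()
geometry lookup is just an assumption about how it could be wired up, and
the altmap case is ignored entirely.

/* Hypothetical sketch, not part of this series: tell whether a struct page
 * lives in a reused (deduplicated) tail vmemmap page of a compound devmap,
 * so page_init_poison() could skip writing to it. */
static bool page_is_reused_compound_tail(struct page *page)
{
	unsigned long pfn = page_to_pfn(page);
	struct dev_pagemap *pgmap = get_dev_pagemap(pfn, NULL);
	bool reused = false;

	if (pgmap && pgmap_vmemmap_nr(pgmap) > 1) {
		unsigned long head_pfn = ALIGN_DOWN(pfn, pgmap_vmemmap_nr(pgmap));

		/* Struct pages beyond the first 2 vmemmap pages' worth
		 * (128 with 4K pages and a 64-byte struct page) share their
		 * backing vmemmap page, so they must not be written. */
		reused = pfn - head_pfn >= 2 * (PAGE_SIZE / sizeof(struct page));
	}

	put_dev_pagemap(pgmap);
	return reused;
}

page_init_poison() could then bail out (or warn) for such pages instead of
memset()ing them, but that is just how I imagine it, not a suggestion that
this is the right final shape.
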

Thanks.


