On 2/8/22 23:44, Muchun Song wrote:
> On Wed, Jan 26, 2022 at 4:04 PM Muchun Song <songmuchun@xxxxxxxxxxxxx> wrote:
>>
>> On Wed, Nov 24, 2021 at 11:09 AM Andrew Morton
>> <akpm@xxxxxxxxxxxxxxxxxxxx> wrote:
>>>
>>> On Mon, 22 Nov 2021 12:21:32 +0800 Muchun Song <songmuchun@xxxxxxxxxxxxx> wrote:
>>>
>>>> On Wed, Nov 10, 2021 at 2:18 PM Muchun Song <songmuchun@xxxxxxxxxxxxx> wrote:
>>>>>
>>>>> On Tue, Nov 9, 2021 at 3:33 AM Mike Kravetz <mike.kravetz@xxxxxxxxxx> wrote:
>>>>>>
>>>>>> On 11/8/21 12:16 AM, Muchun Song wrote:
>>>>>>> On Mon, Nov 1, 2021 at 11:22 AM Muchun Song <songmuchun@xxxxxxxxxxxxx> wrote:
>>>>>>>>
>>>>>>>> This series significantly reduces the struct page overhead of 2MB
>>>>>>>> HugeTLB pages. It cuts the struct page overhead by a further 12.5%
>>>>>>>> per 2MB HugeTLB page compared to the previous approach, which means
>>>>>>>> saving 2GB per 1TB of HugeTLB pages. It is a nice gain. Comments
>>>>>>>> and reviews are welcome. Thanks.
>>>>>>>>
>>>>>>>
>>>>>>> Hi,
>>>>>>>
>>>>>>> Ping guys. Does anyone have any comments or suggestions
>>>>>>> on this series?
>>>>>>>
>>>>>>> Thanks.
>>>>>>>
>>>>>>
>>>>>> I did look over the series earlier. I have no issue with the hugetlb and
>>>>>> vmemmap modifications as they are enhancements to the existing
>>>>>> optimizations. My primary concern is the (small) increased overhead
>>>>>> for the helpers as outlined in your cover letter. Since these helpers
>>>>>> are not limited to hugetlb and are used throughout the kernel, I would
>>>>>> really like to get comments from others with a better understanding of
>>>>>> the potential impact.
>>>>>
>>>>> Thanks Mike. I'd like to hear others' comments about this as well.
>>>>> From my point of view, the (small) overhead may be acceptable
>>>>> since it only affects the head page, and Matthew Wilcox's folio
>>>>> series could reduce this overhead as well.
>>>
>>> I think Mike was inviting you to run some tests to quantify the
>>> overhead ;)
>>
>> Hi Andrew,
>>
>> Sorry for the late reply.
>>
>> Specific overhead figures are already in the cover letter. I also
>> ran some other tests, e.g. kernel compilation and sysbench, and
>> didn't see any regressions.
>
> The overhead is introduced by page_fixed_fake_head(), which
> has an "if" statement and an access to a possibly cold cache line.
> I think the main overhead is from the latter. However, probabilistically,
> only 1/64 of the pages need that access. And
> page_fixed_fake_head() is already simple (I mean the overhead
> is small enough), and most performance bottlenecks in mm are
> not in compound_head() anyway. This also matches the tests I did:
> I didn't see any regressions after enabling this feature.
>
> I know Mike's concern is the increased overhead for use cases
> beyond HugeTLB. If we really want to avoid the access to
> a possibly cold cache line, we could introduce a new page
> flag like PG_hugetlb, test whether it is set in page->flags,
> and if so return the real head struct page. Then
> page_fixed_fake_head() would look like below.
>
> static __always_inline const struct page *
> page_fixed_fake_head(const struct page *page)
> {
>         if (!hugetlb_free_vmemmap_enabled())
>                 return page;
>
>         if (test_bit(PG_hugetlb, &page->flags)) {
>                 unsigned long head = READ_ONCE(page[1].compound_head);
>
>                 if (likely(head & 1))
>                         return (const struct page *)(head - 1);
>         }
>         return page;
> }
>
> But I don't think it's worth doing this.
>
> Hi Mike and Andrew,
>
> Since these helpers are not limited to hugetlb and are used throughout
> the kernel, I would really like to get comments from others with a
> better understanding of the potential impact. Do you have any
> appropriate reviewers to invite?
>

I think the appropriate people are already on Cc, as they provided input
on the original vmemmap optimization series.

The question that needs to be answered is simple enough: is the savings
of one vmemmap page per hugetlb page worth the extra minimal overhead
in compound_head()? Like most things, this depends on workload.

One thing to note is that the compound_head() overhead is only
introduced if hugetlb vmemmap freeing is enabled. Correct? During the
original vmemmap optimization discussions, people thought it important
that this be 'opt in'. I do not know if distros will enable this by
default. But perhaps the potential overhead can be thought of as just
part of 'opting in' for vmemmap optimizations.
--
Mike Kravetz
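For readers following the archive, below is a compact user-space sketch
of the mechanism being weighed in this thread: compound_head() keeps its
single test-and-branch fast path, and only falls through to
page_fixed_fake_head() when bit 0 of page->compound_head is clear. This
is an illustrative model, not the kernel code: the two-field struct
page, the stubbed hugetlb_free_vmemmap_enabled(), and main() are
assumptions made for the sake of a runnable example, and the real helper
guards the page[1] read behind additional checks (a static key plus a
flag test, as in the code quoted above).

#include <stdio.h>
#include <stdbool.h>

/* Pared-down model of struct page: only the fields the fast path touches. */
struct page {
        unsigned long flags;
        unsigned long compound_head;    /* bit 0 set => pointer to head page */
};

/* Stub for the runtime switch; in the kernel this is a static key. */
static bool hugetlb_free_vmemmap_enabled(void)
{
        return true;    /* pretend the admin opted in */
}

/*
 * Model of the helper added by the series. When the feature is off this
 * is a single predictable branch; when it is on, a possible "fake" head
 * is fixed up by reading page[1].compound_head -- the potentially cold
 * cache line discussed above.
 */
static const struct page *page_fixed_fake_head(const struct page *page)
{
        if (!hugetlb_free_vmemmap_enabled())
                return page;

        unsigned long head = page[1].compound_head;

        if (head & 1)
                return (const struct page *)(head - 1);
        return page;
}

/* Model of compound_head(): the tail-page fast path is unchanged. */
static const struct page *compound_head(const struct page *page)
{
        unsigned long head = page->compound_head;

        if (head & 1)
                return (const struct page *)(head - 1);
        return page_fixed_fake_head(page);
}

int main(void)
{
        /*
         * Three consecutive struct pages standing in for a vmemmap slice:
         * pages[0] is the head, pages[1] and pages[2] are tails pointing
         * back at it via the tagged compound_head pointer.
         */
        struct page pages[3] = { 0 };

        pages[1].compound_head = (unsigned long)&pages[0] | 1;
        pages[2].compound_head = (unsigned long)&pages[0] | 1;

        printf("head of tail: %p (expect %p)\n",
               (void *)compound_head(&pages[2]), (void *)&pages[0]);
        printf("head of head: %p (expect %p)\n",
               (void *)compound_head(&pages[0]), (void *)&pages[0]);
        return 0;
}

What the sketch makes concrete is the cost structure under debate: a
genuine tail page (compound_head bit set) never touches page[1], so for
non-hugetlb users the only new cost is the extra branch, and the
possibly cold read of page[1] is confined to head-candidate pages.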