On Mon, Nov 16, 2020 at 9:33 PM Oscar Salvador <osalvador@xxxxxxx> wrote: > > On Fri, Nov 13, 2020 at 06:59:35PM +0800, Muchun Song wrote: > > If the size of HugeTLB page is 2MB, we need 512 struct page structures > > (8 pages) to be associated with it. As far as I know, we only use the > > first 4 struct page structures. Use of first 4 struct page structures > > comes from HUGETLB_CGROUP_MIN_ORDER. > > Once you mention 2MB HugeTLB page and its specific I would also mention > 1GB HugeTLB pages, maybe something along these lines. > I would supress "As far as I know", we __know__ that we only use > the first 4 struct page structures to track metadata information. Thanks. Will do. > > > +/* > > + * There are 512 struct page structures(8 pages) associated with each 2MB > > + * hugetlb page. For tail pages, the value of compound_head is the same. > > + * So we can reuse first page of tail page structures. We map the virtual > > + * addresses of the remaining 6 pages of tail page structures to the first > > + * tail page struct, and then free these 6 pages. Therefore, we need to > > + * reserve at least 2 pages as vmemmap areas. > > + */ > > +#define RESERVE_VMEMMAP_NR 2U > > Either I would include 1GB specific there as well, or I would not add > any specifics at all and just go by saying that first two pages are used, > and the rest can be remapped to the first page that contains the tails. Thanks. Will do. > > > > +void __init hugetlb_vmemmap_init(struct hstate *h) > > +{ > > + unsigned int order = huge_page_order(h); > > + unsigned int vmemmap_pages; > > + > > + vmemmap_pages = ((1 << order) * sizeof(struct page)) >> PAGE_SHIFT; > > + /* > > + * The head page and the first tail page are not to be freed to buddy > > + * system, the others page will map to the first tail page. So there > "the remaining pages" might be more clear. Thanks. > > > + * are (@vmemmap_pages - RESERVE_VMEMMAP_NR) pages can be freed. > "that can be freed" Thanks. > > > + * > > + * Could RESERVE_VMEMMAP_NR be greater than @vmemmap_pages? This is > > + * not expected to happen unless the system is corrupted. So on the > > + * safe side, it is only a safety net. > > + */ > > + if (likely(vmemmap_pages > RESERVE_VMEMMAP_NR)) > > + h->nr_free_vmemmap_pages = vmemmap_pages - RESERVE_VMEMMAP_NR; > > + else > > + h->nr_free_vmemmap_pages = 0; > > This made think of something. > Since struct hstate hstates is global, all the fields should be defined to 0. > So, the following assignments in hugetlb_add_hstate: > > h->nr_huge_pages = 0; > h->free_huge_pages = 0; > > should not be needed. > Actually, we do not initialize other values like resv_huge_pages > or surplus_huge_pages. > > If that is the case, the "else" could go. Yeah, I agree with you. > > Mike? > > The changes itself look good to me. > I think that putting all the vemmap stuff into hugetlb-vmemmap.* was > the right choice. > > > -- > Oscar Salvador > SUSE L3 -- Yours, Muchun