Hi Mike, Please find my comments inline. Mike Kravetz <mike.kravetz@xxxxxxxxxx> writes: > On 06/06/23 10:32, Tarun Sahu wrote: >> >> Hi Mike, >> >> Thanks for your inputs. >> I wanted to know if you find it okay, Can I send it again adding your Reviewed-by? > > Hi Tarun, > > Just a few more comments/questions. > > On 05/15/23 22:38, Tarun Sahu wrote: >> folio_set_order(folio, 0) is used in kernel at two places >> __destroy_compound_gigantic_folio and __prep_compound_gigantic_folio. >> Currently, It is called to clear out the folio->_folio_nr_pages and >> folio->_folio_order. >> >> For __destroy_compound_gigantic_folio: >> In past, folio_set_order(folio, 0) was needed because page->mapping used >> to overlap with _folio_nr_pages and _folio_order. So if these fields were >> left uncleared during freeing gigantic hugepages, they were causing >> "BUG: bad page state" due to non-zero page->mapping. Now, After >> Commit a01f43901cfb ("hugetlb: be sure to free demoted CMA pages to >> CMA") page->mapping has explicitly been cleared out for tail pages. Also, >> _folio_order and _folio_nr_pages no longer overlaps with page->mapping. > > I believe the same logic/reasoning as above also applies to > __prep_compound_gigantic_folio. > Why? > In __prep_compound_gigantic_folio we only call folio_set_order(folio, 0) > in the case of error. If __prep_compound_gigantic_folio fails, the caller > will then call free_gigantic_folio() on the "gigantic page". However, it is > not really a gigantic at this point in time, and we are simply calling > cma_release() or free_contig_range(). > The end result is that I do not believe the existing call to > folio_set_order(folio, 0) in __prep_compound_gigantic_folio is actually > required. ??? No, there is a difference. IIUC, __destroy_compound_gigantic_folio explicitly reset page->mapping for each page of compound page which makes sure, even if in future some fields of struct page/folio overlaps with page->mapping, it won't cause `BUG: bad page state` error. But If we just remove folio_set_order(folio, 0) from __prep_compound_gigantic_folio without moving folio_set_order(folio, order), this will cause extra maintenance overhead to track if page->_folio_order overlaps with page->mapping everytime struct page fields are changed. As in case of overlapping page->mapping will be non-zero. IMHO, To avoid it, moving the folio_set_order(folio, order) after all error checks are done on tail pages. So, _folio_order is either set on success and not set in case of error. (which is the original proposal). But for folio_set_head, I agree the way you suggested below. WDYT? > > If my reasoning above is correct, then we could just have one patch to > remove the folio_set_order(folio, 0) calls and remove special casing for > order 0 in folio_set_order. > > However, I still believe your restructuring of __prep_compound_gigantic_folio, > is of value. I do not believe there is an issue as questioned by Matthew. My > reasoning has been stated previously. We could make changes like the following > to retain the same order of operations in __prep_compound_gigantic_folio and > totally avoid Matthew's question. Totally untested. > > diff --git a/mm/hugetlb.c b/mm/hugetlb.c > index ea24718db4af..a54fee663cb1 100644 > --- a/mm/hugetlb.c > +++ b/mm/hugetlb.c > @@ -1950,10 +1950,8 @@ static bool __prep_compound_gigantic_folio(struct folio *folio, > int nr_pages = 1 << order; > struct page *p; > > - __folio_clear_reserved(folio); > - __folio_set_head(folio); > /* we rely on prep_new_hugetlb_folio to set the destructor */ > - folio_set_order(folio, order); > + > for (i = 0; i < nr_pages; i++) { > p = folio_page(folio, i); > > @@ -1969,7 +1967,7 @@ static bool __prep_compound_gigantic_folio(struct folio *folio, > * on the head page when they need know if put_page() is needed > * after get_user_pages(). > */ > - if (i != 0) /* head page cleared above */ > + if (i != 0) /* head page cleared below */ > __ClearPageReserved(p); > /* > * Subtle and very unlikely > @@ -1996,8 +1994,14 @@ static bool __prep_compound_gigantic_folio(struct folio *folio, > } else { > VM_BUG_ON_PAGE(page_count(p), p); > } > - if (i != 0) > + > + if (i == 0) { > + __folio_clear_reserved(folio); > + __folio_set_head(folio); > + folio_set_order(folio, order); With folio_set_head, I agree to this, But does not feel good with folio_set_order as per my above reasoning. WDYT? > + } else { > set_compound_head(p, &folio->page); > + } > } > atomic_set(&folio->_entire_mapcount, -1); > atomic_set(&folio->_nr_pages_mapped, 0); > @@ -2017,7 +2021,6 @@ static bool __prep_compound_gigantic_folio(struct folio *folio, > p = folio_page(folio, j); > __ClearPageReserved(p); > } > - folio_set_order(folio, 0); > __folio_clear_head(folio); > return false; > } > > >> >> struct page { >> ... >> struct address_space * mapping; /* 24 8 */ >> ... >> } >> >> struct folio { >> ... >> union { >> struct { >> long unsigned int _flags_1; /* 64 8 */ >> long unsigned int _head_1; /* 72 8 */ >> unsigned char _folio_dtor; /* 80 1 */ >> unsigned char _folio_order; /* 81 1 */ >> >> /* XXX 2 bytes hole, try to pack */ >> >> atomic_t _entire_mapcount; /* 84 4 */ >> atomic_t _nr_pages_mapped; /* 88 4 */ >> atomic_t _pincount; /* 92 4 */ >> unsigned int _folio_nr_pages; /* 96 4 */ >> }; /* 64 40 */ >> struct page __page_1 __attribute__((__aligned__(8))); /* 64 64 */ >> } >> ... >> } > > I do not think the copy of page/folio definitions adds much value to the > commit message. Yeah, Will remove it. > > -- > Mike Kravetz