On 06/08/23 15:33, Tarun Sahu wrote:
> Hi Mike,
>
> Please find my comments inline.
>
> Mike Kravetz <mike.kravetz@xxxxxxxxxx> writes:
>
> > On 06/06/23 10:32, Tarun Sahu wrote:
> >>
> >> Hi Mike,
> >>
> >> Thanks for your inputs.
> >> I wanted to know if you find it okay. Can I send it again adding your
> >> Reviewed-by?
> >
> > Hi Tarun,
> >
> > Just a few more comments/questions.
> >
> > On 05/15/23 22:38, Tarun Sahu wrote:
> >> folio_set_order(folio, 0) is used in the kernel at two places:
> >> __destroy_compound_gigantic_folio and __prep_compound_gigantic_folio.
> >> Currently, it is called to clear out folio->_folio_nr_pages and
> >> folio->_folio_order.
> >>
> >> For __destroy_compound_gigantic_folio:
> >> In the past, folio_set_order(folio, 0) was needed because page->mapping
> >> used to overlap with _folio_nr_pages and _folio_order. If these fields
> >> were left uncleared while freeing gigantic hugepages, they caused
> >> "BUG: bad page state" due to a non-zero page->mapping. Now, after
> >> commit a01f43901cfb ("hugetlb: be sure to free demoted CMA pages to
> >> CMA"), page->mapping is explicitly cleared for tail pages. Also,
> >> _folio_order and _folio_nr_pages no longer overlap with page->mapping.
> >
> > I believe the same logic/reasoning as above also applies to
> > __prep_compound_gigantic_folio.  Why?
> > In __prep_compound_gigantic_folio we only call folio_set_order(folio, 0)
> > in the case of error.  If __prep_compound_gigantic_folio fails, the
> > caller will then call free_gigantic_folio() on the "gigantic page".
> > However, it is not really a gigantic folio at this point in time, and we
> > are simply calling cma_release() or free_contig_range().
> > The end result is that I do not believe the existing call to
> > folio_set_order(folio, 0) in __prep_compound_gigantic_folio is actually
> > required. ???

> No, there is a difference.
> IIUC, __destroy_compound_gigantic_folio explicitly resets page->mapping
> for each page of the compound page, which makes sure that even if, in
> the future, some field of struct page/folio overlaps with page->mapping,
> it won't cause a `BUG: bad page state` error. But if we just remove
> folio_set_order(folio, 0) from __prep_compound_gigantic_folio without
> moving folio_set_order(folio, order), this will cause extra maintenance
> overhead to track whether page->_folio_order overlaps with page->mapping
> every time struct page fields are changed, as in the overlapping case
> page->mapping will be non-zero. IMHO, to avoid that, move
> folio_set_order(folio, order) after all error checks are done on the
> tail pages, so that _folio_order is set on success and not set in case
> of error (which is the original proposal). But for folio_set_head, I
> agree with the way you suggested below.
>
> WDYT?

Right.  It is more 'future proof' to only set folio order on success as
done in your original patch.

> >
> > If my reasoning above is correct, then we could just have one patch to
> > remove the folio_set_order(folio, 0) calls and remove the special
> > casing for order 0 in folio_set_order.
> >
> > However, I still believe your restructuring of
> > __prep_compound_gigantic_folio is of value. I do not believe there is
> > an issue as questioned by Matthew; my reasoning has been stated
> > previously. We could make changes like the following to retain the
> > same order of operations in __prep_compound_gigantic_folio and totally
> > avoid Matthew's question. Totally untested.
> >
> > diff --git a/mm/hugetlb.c b/mm/hugetlb.c
> > index ea24718db4af..a54fee663cb1 100644
> > --- a/mm/hugetlb.c
> > +++ b/mm/hugetlb.c
> > @@ -1950,10 +1950,8 @@ static bool __prep_compound_gigantic_folio(struct folio *folio,
> >  	int nr_pages = 1 << order;
> >  	struct page *p;
> >
> > -	__folio_clear_reserved(folio);
> > -	__folio_set_head(folio);
> >  	/* we rely on prep_new_hugetlb_folio to set the destructor */
> > -	folio_set_order(folio, order);
> > +
> >  	for (i = 0; i < nr_pages; i++) {
> >  		p = folio_page(folio, i);
> >
> > @@ -1969,7 +1967,7 @@ static bool __prep_compound_gigantic_folio(struct folio *folio,
> >  		 * on the head page when they need know if put_page() is needed
> >  		 * after get_user_pages().
> >  		 */
> > -		if (i != 0)	/* head page cleared above */
> > +		if (i != 0)	/* head page cleared below */
> >  			__ClearPageReserved(p);
> >  		/*
> >  		 * Subtle and very unlikely
> > @@ -1996,8 +1994,14 @@ static bool __prep_compound_gigantic_folio(struct folio *folio,
> >  		} else {
> >  			VM_BUG_ON_PAGE(page_count(p), p);
> >  		}
> > -		if (i != 0)
> > +
> > +		if (i == 0) {
> > +			__folio_clear_reserved(folio);
> > +			__folio_set_head(folio);
> > +			folio_set_order(folio, order);

> With folio_set_head, I agree with this, but it does not feel good for
> folio_set_order, as per my above reasoning. WDYT?

Agree with your reasoning.  We should just move __folio_set_head and
folio_set_order after the loop as you originally suggested.

> > +		} else {
> >  			set_compound_head(p, &folio->page);
> > +		}
> >  	}
> >  	atomic_set(&folio->_entire_mapcount, -1);
> >  	atomic_set(&folio->_nr_pages_mapped, 0);
> > @@ -2017,7 +2021,6 @@ static bool __prep_compound_gigantic_folio(struct folio *folio,
> >  		p = folio_page(folio, j);
> >  		__ClearPageReserved(p);
> >  	}
> > -	folio_set_order(folio, 0);
> >  	__folio_clear_head(folio);
> >  	return false;
> >  }
> >
> >
> >> struct page {
> >> ...
> >> 	struct address_space * mapping;	/* 24 8 */
> >> ...
> >> }
> >>
> >> struct folio {
> >> ...
> >> 	union {
> >> 		struct {
> >> 			long unsigned int _flags_1;      /* 64  8 */
> >> 			long unsigned int _head_1;       /* 72  8 */
> >> 			unsigned char _folio_dtor;       /* 80  1 */
> >> 			unsigned char _folio_order;      /* 81  1 */
> >>
> >> 			/* XXX 2 bytes hole, try to pack */
> >>
> >> 			atomic_t _entire_mapcount;       /* 84  4 */
> >> 			atomic_t _nr_pages_mapped;       /* 88  4 */
> >> 			atomic_t _pincount;              /* 92  4 */
> >> 			unsigned int _folio_nr_pages;    /* 96  4 */
> >> 		};                                       /* 64 40 */
> >> 		struct page __page_1 __attribute__((__aligned__(8))); /* 64 64 */
> >> 	}
> >> ...
> >> }
> >
> > I do not think the copy of page/folio definitions adds much value to
> > the commit message.
>
> Yeah, will remove it.

I think we are finally on the same page.  I am good with this v2 patch.
The only change needed is to update the commit message to remove the
definitions.
-- 
Mike Kravetz