On 9/29/21 4:21 PM, Mike Kravetz wrote:
> On 9/29/21 12:42 PM, Mike Kravetz wrote:
>> On 9/24/21 2:36 AM, David Hildenbrand wrote:
>>> On 23.09.21 19:53, Mike Kravetz wrote:
>>>> When huge page demotion is fully implemented, gigantic pages can be
>>>> demoted to a smaller huge page size. For example, on x86 a 1G page
>>>> can be demoted to 512 2M pages. However, gigantic pages can potentially
>>>> be allocated from CMA. If a gigantic page which was allocated from CMA
>>>> is demoted, the corresponding demoted pages need to be returned to CMA.
>>>>
>>>> In order to track hugetlb pages that need to be returned to CMA, add the
>>>> hugetlb specific flag HPageCma. The flag is set when a huge page is
>>>> allocated from CMA and transferred to any demoted pages. Non-gigantic
>>>> huge page freeing code checks for the flag and takes appropriate action.
>>>
>>> Do we really need that flag, or couldn't we simply always try
>>> cma_release() and fall back to our ordinary freeing path?
>>>
>>> IIRC, CMA knows exactly if something was allocated via a CMA area and
>>> can be freed via it. No need for additional tracking usually.
>>>
>>
>> Yes, I think this is possible.
>> Initially, I thought the check for whether pages were part of CMA
>> involved a mutex or some type of locking. But, it really is
>> lightweight. So, it should not be an issue to call it in every case.
>
> When modifying the code, I did come across one issue. Sorry I did not
> immediately remember this.
>
> Gigantic pages are allocated as a 'set of pages' and turned into a compound
> page by the hugetlb code. They must be restored to a 'set of pages' before
> calling cma_release. You cannot pass a compound page to cma_release.
> Non-gigantic pages are allocated from the buddy directly as compound pages.
> They are returned to the buddy as compound pages.
>
> So, the issue comes about when freeing a non-gigantic page. We would
> need to convert it to a 'set of pages' before calling cma_release just to
> see if cma_release succeeds.
> Then, if it fails, convert back to a
> compound page to call __free_pages. Conversion is somewhat expensive as
> we must modify every tail page struct.
>
> Some possible solutions:
> - Create a cma_pages_valid() routine that checks whether the pages
>   belong to a CMA region. Only convert to a 'set of pages' if
>   cma_pages_valid() succeeds and we know the subsequent cma_release
>   will succeed.
> - Always convert non-gigantic pages to a 'set of pages' before freeing.
>   Alternatively, don't allocate compound pages from the buddy and just
>   use the hugetlb gigantic page prep and destroy routines for all
>   hugetlb page sizes.
> - Use some kind of flag as in the proposed patch.
>
> Having hugetlb just allocate a set of pages from the buddy is interesting.
> This would make the allocate/free code paths for gigantic and
> non-gigantic pages align more closely. It may result in overall code
> simplification, not just for demote.

Taking this approach actually got a bit ugly in alloc_and_dissolve_huge_page,
which is used in migration. Instead, I took the approach of adding a
'cma_pages_valid()' interface to check at page freeing time.

Sending out v3 shortly.
--
Mike Kravetz