On Tue, 24 Aug 2021 15:08:46 -0700 Mike Kravetz wrote:
>On 8/16/21 6:46 PM, Andrew Morton wrote:
>> On Mon, 16 Aug 2021 17:46:58 -0700 Mike Kravetz <mike.kravetz@xxxxxxxxxx> wrote:
>>
>>>> It really is a ton of new code.  I think we're owed much more detail
>>>> about the problem than the above.  To be confident that all this
>>>> material is truly justified?
>>>
>>> The desired functionality for this specific use case is to simply
>>> convert a 1G hugetlb page to 512 2MB hugetlb pages.  As mentioned
>>>
>>> "Converting larger to smaller hugetlb pages can be accomplished today by
>>>  first freeing the larger page to the buddy allocator and then allocating
>>>  the smaller pages.  However, there are two issues with this approach:
>>>  1) This process can take quite some time, especially if allocation of
>>>     the smaller pages is not immediate and requires migration/compaction.
>>>  2) There is no guarantee that the total size of smaller pages allocated
>>>     will match the size of the larger page which was freed.  This is
>>>     because the area freed by the larger page could quickly be
>>>     fragmented."
>>>
>>> These two issues have been experienced in practice.
>>
>> Well the first issue is quantifiable.  What is "some time"?  If it's
>> people trying to get a 5% speedup on a rare operation because hey,
>> bugging the kernel developers doesn't cost me anything then perhaps we
>> have better things to be doing.
>
>Well, I set up a test environment on a larger system to get some
>numbers.  My 'load' on the system was filling the page cache with
>clean pages.  The thought is that these pages could easily be reclaimed.
>
>When trying to get numbers I hit a hugetlb page allocation stall where
>__alloc_pages(__GFP_RETRY_MAYFAIL, order 9) would stall forever (or at
>least an hour).  It was very much like the symptoms addressed here:
>https://lore.kernel.org/linux-mm/20190806014744.15446-1-mike.kravetz@xxxxxxxxxx/
>
>This was on 5.14.0-rc6-next-20210820.
>
>I'll do some more digging as this appears to be some dark corner case of
>reclaim and/or compaction.  The 'good news' is that I can reproduce
>this.

I am on vacation until 1 Sep, but will keep an ear out for any light shed
on the corner cases.

Hillf

>
>> And the second problem would benefit from some words to help us
>> understand how much real-world hurt this causes, and how frequently.
>> And let's understand what the userspace workarounds look like, etc.
>
>The stall above was from doing a simple 'free 1GB page' followed by
>'allocate 512 2MB pages' from userspace.
>
>Getting out another version of this series will be delayed, as I think
>we need to address or understand this issue first.
>--
>Mike Kravetz
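
For reference, the 'free 1GB page' then 'allocate 512 2MB pages' sequence quoted above might look roughly like the sketch below. It assumes the pools are resized through the per-size nr_hugepages files under /sys/kernel/mm/hugepages, which the thread does not actually state. The helper names (pool_path, plan_conversion) are made up for illustration, and the code only computes the values to write; it does not touch sysfs.

```python
# Sketch only: plan the two sysfs writes for converting 1GB hugetlb pages
# into 2MB hugetlb pages by freeing the large pages and then growing the
# small pool. Helper names are hypothetical; nothing is written to sysfs.

SYSFS_ROOT = "/sys/kernel/mm/hugepages"

SIZE_1G_KB = 1024 * 1024   # 1GB expressed in kB
SIZE_2M_KB = 2 * 1024      # 2MB expressed in kB


def pool_path(size_kb):
    """Return the nr_hugepages file for the pool of the given page size."""
    return f"{SYSFS_ROOT}/hugepages-{size_kb}kB/nr_hugepages"


def plan_conversion(cur_1g, cur_2m, count=1):
    """Return the ordered (path, new_value) writes for converting
    `count` 1GB pages, given the current pool sizes."""
    if count < 0 or count > cur_1g:
        raise ValueError("cannot free more 1GB pages than the pool holds")
    small_per_large = SIZE_1G_KB // SIZE_2M_KB   # 512 2MB pages per 1GB page
    return [
        # Step 1: shrink the 1GB pool, returning memory to the buddy allocator.
        (pool_path(SIZE_1G_KB), cur_1g - count),
        # Step 2: grow the 2MB pool; this is the step that can be slow or
        # fall short if the freed range has been fragmented in the meantime.
        (pool_path(SIZE_2M_KB), cur_2m + count * small_per_large),
    ]


if __name__ == "__main__":
    for path, value in plan_conversion(cur_1g=1, cur_2m=0):
        print(f"echo {value} > {path}")
```

Nothing ties the second write to the memory released by the first, which is where both issues in the quoted cover letter come from.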