On 17/07/2024 09:44, David Hildenbrand wrote: >>>> I guess the real supported orders are: >>>> >>>> anon: >>>> min order: 2 >>>> max order: PMD_ORDER >>>> anon-shmem: >>>> min order: 1 >>>> max order: MAX_PAGECACHE_ORDER >>>> tmpfs-shmem: >>>> min order: PMD_ORDER <= 11 ? PMD_ORDER : NONE >>>> max order: PMD_ORDER <= 11 ? PMD_ORDER : NONE >>>> file: >>>> min order: 1 >>>> max order: MAX_PAGECACHE_ORDER >>> >>> That's my understanding. But not sure about anon-shmem really supporting >>> order-1, maybe we do. >> >> Oh, I thought we only had the restriction for anon folios now (due to deferred >> split queue), so assumed it would just work. With Gavin's >> THP_ORDERS_ALL_FILE_DEFAULT change, that certainly implies that shmem must >> support order-1. If it doesn't then we we might want to tidy that further. >> >> Baolin, perhaps you can confirm either way? > > Currently there would not have been a way to enable it, right? (maybe I'm wrong) __thp_vma_allowable_orders() doesn't do anything special for shmem if TVA_IN_PF is set, so I guess it could concievably return order-1 in that path. Not sure if it ever gets called that way for shmem though - I don't think so. But agree that shmem_allowable_huge_orders() will not currently return order-1. > >> >>> >>>> >>>> But today, controls and stats are exposed for: >>>> >>>> anon: >>>> min order: 2 >>>> max order: PMD_ORDER >>>> anon-shmem: >>>> min order: 2 >>>> max order: PMD_ORDER >>>> tmpfs-shmem: >>>> min order: PMD_ORDER >>>> max order: PMD_ORDER >>>> file: >>>> min order: Nothing yet (this patch proposes 1) >>>> max order: Nothing yet (this patch proposes MAX_PAGECACHE_ORDER) >>>> >>>> So I think there is definitely a bug for shmem where the minimum order control >>>> should be order-1 but its currently order-2. >>> >>> Maybe, did not play with that yet. Likely order-1 will work. (although probably >>> of questionable use :) ) >> >> You might have to expand on why its of "questionable use". I'd assume it has the >> same amount of value as using order-1 for regular page cache pages? i.e. half >> the number of objects to manage for the same amount of memory. > > order-1 was recently added for the pagecache to get some device setups running > (IIRC, where we cannot use order-0, because device blocksize > PAGE_SIZE). > > You might be right about "half the number of objects", but likely just going for > order-2, order-3, order-4 ... for shmem might be even better. And simply falling > back to order-0 when you cannot get the larger orders. Sure, but then you're into the territory of baking in policy. Remember that originally I was only interested in 64K but the concensus was to expose all the sizes. Same argument applies to 8K; expose it and let others decide policy. > > I could have sworn you mentioned something like that in your "configurable > orders for pagecache" RFC that I only briefly skimmed so far :P I'm exposing the 8K control for pagecache in that series. > > ... only enabling "order-1" and none of the other orders for shmem sounds rather > "interesting". > > But yeah, maybe there is valid use for it, so I'm all for allowing it if it can > be done. > >> >>> >>>> >>>> I also wonder about PUD-order for DAX? We don't currently have a stat/control. >>>> If we wanted to add it in future, if we take the "expose all stats/controls for >>>> all orders" approach, we would end up extending all the way to PUD-order and >>>> all >>>> the orders between PMD and PUD would be dummy for all memory types. That really >>>> starts to feel odd, so I still favour only populating what's really supported. >>> >>> I would go further and say that calling the fsdax thing a THP is borderline >>> wrong and we should not expose any new toggles for it that way. >>> >>> It really behaves much more like hugetlb folios that can be PTE-mapped ... we >>> cannot split these things, and they are not allocated from the buddy. So I >>> wouldn't worry about fsdax for now. >>> >>> fsdax support for compound pages (now large folios) probably never should have >>> been glued to any THP toggle. >> >> Yeah fair enough. I wasn't really arguing for adding any dax controls; I was >> just trying to think of examples as to why adding dummy controls might be a bad >> idea. > > Yes. > >>> >>>> >>>> I propose to fix shmem (extend down to 1, stop at MAX_PAGECACHE_ORDER) and >>>> continue with the approach of "indicating only what really exists" for v2. >>>> >>>> Shout if you disagree. >>> >>> Makes sense. >> >> Excellent. I posted v2, which has these changes, yesterday afternoon. :) > > Yes, still digging through mails ... in-between having roughly 1000 meetings a > day :) No problem. You're in-demand. I can wait. :)