On 05/07/2024 09:57, Baolin Wang wrote: > > > On 2024/7/5 16:42, Ryan Roberts wrote: >> On 05/07/2024 04:01, Baolin Wang wrote: >>> >>> >>> On 2024/7/4 22:46, Bang Li wrote: >>>> Hi Bao lin, >>>> >>>> On 2024/7/4 19:15, Baolin Wang wrote: >>>>> >>>>>>> + >>>>>>> + /* >>>>>>> + * Only allow inherit orders if the top-level value is 'force', which >>>>>>> + * means non-PMD sized THP can not override 'huge' mount option now. >>>>>>> + */ >>>>>>> + if (shmem_huge == SHMEM_HUGE_FORCE) >>>>>>> + return READ_ONCE(huge_shmem_orders_inherit); >>>>>> >>>>>> I vaguely recall that we originally discussed that trying to set 'force' >>>>>> on the >>>>>> top level control while any per-size controls were set to 'inherit' would >>>>>> be an >>>>>> error, and trying to set 'force' on any per-size control except the PMD-size >>>>>> would be an error? >>>>> >>>>> Right. >>>>> >>>>>> I don't really understand this logic. Shouldn't we just be looking at the >>>>>> per-size control settings (or the top-level control as a proxy for every >>>>>> per-size control that has 'inherit' set)? >>>>> >>>>> ‘force’ will apply the huge orders for anon shmem and tmpfs, so now we only >>>>> allow pmd-mapped THP to be forced. We should not look at per-size control >>>>> settings for tmpfs now (mTHP for tmpfs will be discussed in future). >>>>> >>>>>> >>>>>> Then for tmpfs, which doesn't support non-PMD-sizes yet, we just always >>>>>> use the >>>>>> PMD-size control for decisions. >>>>>> >>>>>> I'm also really struggling with the concept of shmem_is_huge() existing along >>>>>> side shmem_allowable_huge_orders(). Surely this needs to all be refactored >>>>>> into >>>>>> shmem_allowable_huge_orders()? >>>>> >>>>> I understood. But now they serve different purposes: shmem_is_huge() will be >>>>> used to check the huge orders for the top level, for *tmpfs* and anon shmem; >>>>> whereas shmem_allowable_huge_orders() will only be used to check the per-size >>>>> huge orders for anon shmem (excluding tmpfs now). However, as I plan to add >>>>> mTHP support for tmpfs, I think we can perform some cleanups. >>>> >>>> Please count me in, I'd be happy to contribute to the cleanup and enhancement >>>> process if I can. >>> >>> Good. If you have time, I think you can look at the shmem khugepaged issue from >>> the previous discussion [1], which I don't have time to look at now. >>> >>> " >>> (3) khugepaged >>> >>> khugepaged needs to handle larger folios properly as well. Until fixed, >>> using smaller THP sizes as fallback might prohibit collapsing a >>> PMD-sized THP later. But really, khugepaged needs to be fixed to handle >>> that. >>> " >> >> khugepaged can already collapse "folios of any order less then PMD-size" to >> PMD-size, for anon memory. Infact I modified the selftest to be able to test >> that in commit 9f0704eae8a4 ("selftests/mm/khugepaged: enlighten for multi-size >> THP"). I'd be surprised if khugepaged can't alreay handle the same for shmem? > > I did not test this, but from the comment in hpage_collapse_scan_file(), seems > that compacting small mTHP into a single PMD-mapped THP is not supported yet. > > /* > * TODO: khugepaged should compact smaller compound pages > * into a PMD sized page > */ > if (folio_test_large(folio)) { > result = folio_order(folio) == HPAGE_PMD_ORDER && > folio->index == start > /* Maybe PMD-mapped */ > ? SCAN_PTE_MAPPED_HUGEPAGE > : SCAN_PAGE_COMPOUND; > /* > * For SCAN_PTE_MAPPED_HUGEPAGE, further processing > * by the caller won't touch the page cache, and so > * it's safe to skip LRU and refcount checks before > * returning. > */ > break; > } OK, so the functionality is missing just for shmem/file-backed collapse. Got it. Sorry for the noise. > >> Although the test will definitely want to be extended to test it. > > Right.