On Tue, Jul 9, 2024 at 7:55 PM Ryan Roberts <ryan.roberts@xxxxxxx> wrote:
>
> On 09/07/2024 02:44, Barry Song wrote:
> > On Tue, Jul 9, 2024 at 12:30 AM Ryan Roberts <ryan.roberts@xxxxxxx> wrote:
> >>
> >> On 08/07/2024 12:36, Barry Song wrote:
> >>> On Mon, Jul 8, 2024 at 11:24 PM Ryan Roberts <ryan.roberts@xxxxxxx> wrote:
> >>>>
> >>>> The legacy PMD-sized THP counters at /proc/vmstat include
> >>>> thp_file_alloc, thp_file_fallback and thp_file_fallback_charge, which
> >>>> rather confusingly refer to shmem THP and do not include any other
> >>>> types of file pages. This is inconsistent since, in most other places
> >>>> in the kernel, THP counters are explicitly separated for anon, shmem
> >>>> and file flavours. However, we are stuck with it since it constitutes
> >>>> a user ABI.
> >>>>
> >>>> Recently, commit 66f44583f9b6 ("mm: shmem: add mTHP counters for
> >>>> anonymous shmem") added equivalent mTHP stats for shmem, keeping the
> >>>> same "file_" prefix in the names. But in future, we may want to add
> >>>> extra stats to cover actual file pages, at which point it would all
> >>>> become very confusing.
> >>>>
> >>>> So let's take the opportunity to rename these new counters to
> >>>> "shmem_" before the change makes it upstream and the ABI becomes
> >>>> immutable.
> >>>
> >>> Personally, I think this approach is much clearer. However, I recall
> >>> we discussed this before [1], and it seems that inconsistency is a
> >>> concern?
> >>
> >> Embarrassingly, I don't recall that conversation at all :-| but at
> >> least what I said then is consistent with what I've done in this patch.
> >>
> >> I think David's conclusion from that thread was to call them FILE_,
> >> and add both shmem and pagecache counts to those counters, meaning we
> >> could keep the same names as the legacy THP counters. But those legacy
> >> THP counters only count shmem, and I don't think we would get away
> >> with adding pagecache counts to them at this point? (Argument: they
> >> have been around for a long time and there is a risk that user space
> >> relies on them; if they were to dramatically increase due to the
> >> pagecache addition now, that could break things.) In that case, there
> >> is still inconsistency, but it's worse: the names are consistent but
> >> the semantics are inconsistent.
> >>
> >> So my vote is to change to SHMEM_ as per this patch :)
> >
> > I have no objections. However, I dislike the documentation for
> > thp_file_*. Perhaps we can clean it all up together?
>
> I agree that we should clean this documentation up and I'm happy to
> roll it into v2. However, I don't think what you have suggested is
> quite right.
>
> thp_file_alloc, thp_file_fallback and thp_file_fallback_charge *only*
> count shmem. They don't count pagecache. So perhaps the change should
> be "...every time a shmem huge page (despite being named after "file",
> the counter measures only shmem) is..."?

I understand what you are saying, and I know that thp_file_* has only
included shmem so far. My question is whether it will include regular
files in the future. If not, I am perfectly fine with your approach.
READ_ONLY_THP_FOR_FS isn't applicable in this path as it is created by
khugepaged collapse.

>
> thp_file_mapped includes both file and shmem, so I agree with your
> change there.
>
> What do you think?
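One more point worth noting while we are here: the rename only affects
the new per-size mTHP stats in sysfs; the legacy PMD-size names in
/proc/vmstat stay untouched. Roughly how the two generations would be
read after this patch (a sketch only; the paths follow transhuge.rst,
hugepages-64kB is just an example size, and each stats file simply
prints a raw event count):

  # legacy PMD-size counters keep their ABI-frozen thp_file_* names
  grep thp_file /proc/vmstat

  # per-size mTHP counters report the renamed shmem_* stats
  cat /sys/kernel/mm/transparent_hugepage/hugepages-64kB/stats/shmem_alloc
  cat /sys/kernel/mm/transparent_hugepage/hugepages-64kB/stats/shmem_fallback
  cat /sys/kernel/mm/transparent_hugepage/hugepages-64kB/stats/shmem_fallback_charge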
> >
> >
> > diff --git a/Documentation/admin-guide/mm/transhuge.rst b/Documentation/admin-guide/mm/transhuge.rst
> > index 709fe10b60f4..65df48cb3bbb 100644
> > --- a/Documentation/admin-guide/mm/transhuge.rst
> > +++ b/Documentation/admin-guide/mm/transhuge.rst
> > @@ -417,21 +417,22 @@ thp_collapse_alloc_failed
> >          the allocation.
> >
> >  thp_file_alloc
> > -        is incremented every time a file huge page is successfully
> > -        allocated.
> > +        is incremented every time a file (including shmem) huge page is
> > +        successfully allocated.
> >
> >  thp_file_fallback
> > -        is incremented if a file huge page is attempted to be allocated
> > -        but fails and instead falls back to using small pages.
> > +        is incremented if a file (including shmem) huge page is attempted
> > +        to be allocated but fails and instead falls back to using small
> > +        pages.
> >
> >  thp_file_fallback_charge
> > -        is incremented if a file huge page cannot be charged and instead
> > -        falls back to using small pages even though the allocation was
> > -        successful.
> > +        is incremented if a file (including shmem) huge page cannot be
> > +        charged and instead falls back to using small pages even though
> > +        the allocation was successful.
> >
> >  thp_file_mapped
> > -        is incremented every time a file huge page is mapped into
> > -        user address space.
> > +        is incremented every time a file (including shmem) huge page is
> > +        mapped into user address space.
> >
> >  thp_split_page
> >          is incremented every time a huge page is split into base
> >
> >>
> >>>
> >>> [1] https://lore.kernel.org/linux-mm/05d0096e4ec3e572d1d52d33a31a661321ac1551.1713755580.git.baolin.wang@xxxxxxxxxxxxxxxxx/
> >>>
> >>>
> >>>>
> >>>> Signed-off-by: Ryan Roberts <ryan.roberts@xxxxxxx>
> >>>> ---
> >>>>
> >>>> Hi All,
> >>>>
> >>>> Applies on top of today's mm-unstable (2073cda629a4) and tested with
> >>>> mm selftests; no regressions observed.
> >>>>
> >>>> The backstory here is that I'd like to introduce some counters for
> >>>> regular file folio allocations to observe how often large folio
> >>>> allocation succeeds, but these shmem counters are named "file", which
> >>>> is going to make things confusing. So I'm hoping to solve that before
> >>>> commit 66f44583f9b6 ("mm: shmem: add mTHP counters for anonymous
> >>>> shmem") goes upstream (it is currently in mm-stable).
> >>>>
> >>>> Admittedly, this change means the mTHP stat names are not the same as
> >>>> the legacy PMD-size THP names, but I think that's a smaller issue than
> >>>> having "file_" mTHP stats that only count shmem, then having to
> >>>> introduce "file2_" or "pgcache_" stats for the regular file memory,
> >>>> which is even more inconsistent IMHO. I guess the alternative is to
> >>>> count both shmem and file in these mTHP stats (that's how they were
> >>>> documented anyway), but I think it's better to be able to consider
> >>>> them separately like we do for all the other counters.
> >>>>
> >>>> Thanks,
> >>>> Ryan
> >>>>
> >>>>  Documentation/admin-guide/mm/transhuge.rst | 12 ++++++------
> >>>>  include/linux/huge_mm.h                    |  6 +++---
> >>>>  mm/huge_memory.c                           | 12 ++++++------
> >>>>  mm/shmem.c                                 |  8 ++++----
> >>>>  4 files changed, 19 insertions(+), 19 deletions(-)
> >>>>
> >>>> diff --git a/Documentation/admin-guide/mm/transhuge.rst b/Documentation/admin-guide/mm/transhuge.rst
> >>>> index 747c811ee8f1..8b891689fc13 100644
> >>>> --- a/Documentation/admin-guide/mm/transhuge.rst
> >>>> +++ b/Documentation/admin-guide/mm/transhuge.rst
> >>>> @@ -496,16 +496,16 @@ swpout_fallback
> >>>>          Usually because failed to allocate some continuous swap space
> >>>>          for the huge page.
> >>>>
> >>>> -file_alloc
> >>>> -        is incremented every time a file huge page is successfully
> >>>> +shmem_alloc
> >>>> +        is incremented every time a shmem huge page is successfully
> >>>>          allocated.
> >>>>
> >>>> -file_fallback
> >>>> -        is incremented if a file huge page is attempted to be allocated
> >>>> +shmem_fallback
> >>>> +        is incremented if a shmem huge page is attempted to be allocated
> >>>>          but fails and instead falls back to using small pages.
> >>>>
> >>>> -file_fallback_charge
> >>>> -        is incremented if a file huge page cannot be charged and instead
> >>>> +shmem_fallback_charge
> >>>> +        is incremented if a shmem huge page cannot be charged and instead
> >>>>          falls back to using small pages even though the allocation was
> >>>>          successful.
> >>>>
> >>>> diff --git a/include/linux/huge_mm.h b/include/linux/huge_mm.h
> >>>> index acb6ac24a07e..cff002be83eb 100644
> >>>> --- a/include/linux/huge_mm.h
> >>>> +++ b/include/linux/huge_mm.h
> >>>> @@ -269,9 +269,9 @@ enum mthp_stat_item {
> >>>>          MTHP_STAT_ANON_FAULT_FALLBACK_CHARGE,
> >>>>          MTHP_STAT_SWPOUT,
> >>>>          MTHP_STAT_SWPOUT_FALLBACK,
> >>>> -        MTHP_STAT_FILE_ALLOC,
> >>>> -        MTHP_STAT_FILE_FALLBACK,
> >>>> -        MTHP_STAT_FILE_FALLBACK_CHARGE,
> >>>> +        MTHP_STAT_SHMEM_ALLOC,
> >>>> +        MTHP_STAT_SHMEM_FALLBACK,
> >>>> +        MTHP_STAT_SHMEM_FALLBACK_CHARGE,
> >>>>          MTHP_STAT_SPLIT,
> >>>>          MTHP_STAT_SPLIT_FAILED,
> >>>>          MTHP_STAT_SPLIT_DEFERRED,
> >>>> diff --git a/mm/huge_memory.c b/mm/huge_memory.c
> >>>> index 9ec64aa2be94..f9696c94e211 100644
> >>>> --- a/mm/huge_memory.c
> >>>> +++ b/mm/huge_memory.c
> >>>> @@ -568,9 +568,9 @@ DEFINE_MTHP_STAT_ATTR(anon_fault_fallback, MTHP_STAT_ANON_FAULT_FALLBACK);
> >>>>  DEFINE_MTHP_STAT_ATTR(anon_fault_fallback_charge, MTHP_STAT_ANON_FAULT_FALLBACK_CHARGE);
> >>>>  DEFINE_MTHP_STAT_ATTR(swpout, MTHP_STAT_SWPOUT);
> >>>>  DEFINE_MTHP_STAT_ATTR(swpout_fallback, MTHP_STAT_SWPOUT_FALLBACK);
> >>>> -DEFINE_MTHP_STAT_ATTR(file_alloc, MTHP_STAT_FILE_ALLOC);
> >>>> -DEFINE_MTHP_STAT_ATTR(file_fallback, MTHP_STAT_FILE_FALLBACK);
> >>>> -DEFINE_MTHP_STAT_ATTR(file_fallback_charge, MTHP_STAT_FILE_FALLBACK_CHARGE);
> >>>> +DEFINE_MTHP_STAT_ATTR(shmem_alloc, MTHP_STAT_SHMEM_ALLOC);
> >>>> +DEFINE_MTHP_STAT_ATTR(shmem_fallback, MTHP_STAT_SHMEM_FALLBACK);
> >>>> +DEFINE_MTHP_STAT_ATTR(shmem_fallback_charge, MTHP_STAT_SHMEM_FALLBACK_CHARGE);
> >>>>  DEFINE_MTHP_STAT_ATTR(split, MTHP_STAT_SPLIT);
> >>>>  DEFINE_MTHP_STAT_ATTR(split_failed, MTHP_STAT_SPLIT_FAILED);
> >>>>  DEFINE_MTHP_STAT_ATTR(split_deferred, MTHP_STAT_SPLIT_DEFERRED);
> >>>> @@ -581,9 +581,9 @@ static struct attribute *stats_attrs[] = {
> >>>>          &anon_fault_fallback_charge_attr.attr,
> >>>>          &swpout_attr.attr,
> >>>>          &swpout_fallback_attr.attr,
> >>>> -        &file_alloc_attr.attr,
> >>>> -        &file_fallback_attr.attr,
> >>>> -        &file_fallback_charge_attr.attr,
> >>>> +        &shmem_alloc_attr.attr,
> >>>> +        &shmem_fallback_attr.attr,
> >>>> +        &shmem_fallback_charge_attr.attr,
> >>>>          &split_attr.attr,
> >>>>          &split_failed_attr.attr,
> >>>>          &split_deferred_attr.attr,
> >>>> diff --git a/mm/shmem.c b/mm/shmem.c
> >>>> index 921d59c3d669..f24dfbd387ba 100644
> >>>> --- a/mm/shmem.c
> >>>> +++ b/mm/shmem.c
> >>>> @@ -1777,7 +1777,7 @@ static struct folio *shmem_alloc_and_add_folio(struct vm_fault *vmf,
> >>>>                  if (pages == HPAGE_PMD_NR)
> >>>>                          count_vm_event(THP_FILE_FALLBACK);
> >>>>  #ifdef CONFIG_TRANSPARENT_HUGEPAGE
> >>>> -                count_mthp_stat(order, MTHP_STAT_FILE_FALLBACK);
> >>>> +                count_mthp_stat(order, MTHP_STAT_SHMEM_FALLBACK);
> >>>>  #endif
> >>>>                  order = next_order(&suitable_orders, order);
> >>>>          }
> >>>> @@ -1804,8 +1804,8 @@ static struct folio *shmem_alloc_and_add_folio(struct vm_fault *vmf,
> >>>>                          count_vm_event(THP_FILE_FALLBACK_CHARGE);
> >>>>                  }
> >>>>  #ifdef CONFIG_TRANSPARENT_HUGEPAGE
> >>>> -                count_mthp_stat(folio_order(folio), MTHP_STAT_FILE_FALLBACK);
> >>>> -                count_mthp_stat(folio_order(folio), MTHP_STAT_FILE_FALLBACK_CHARGE);
> >>>> +                count_mthp_stat(folio_order(folio), MTHP_STAT_SHMEM_FALLBACK);
> >>>> +                count_mthp_stat(folio_order(folio), MTHP_STAT_SHMEM_FALLBACK_CHARGE);
> >>>>  #endif
> >>>>          }
> >>>>          goto unlock;
> >>>> @@ -2181,7 +2181,7 @@ static int shmem_get_folio_gfp(struct inode *inode, pgoff_t index,
> >>>>                  if (folio_test_pmd_mappable(folio))
> >>>>                          count_vm_event(THP_FILE_ALLOC);
> >>>>  #ifdef CONFIG_TRANSPARENT_HUGEPAGE
> >>>> -                count_mthp_stat(folio_order(folio), MTHP_STAT_FILE_ALLOC);
> >>>> +                count_mthp_stat(folio_order(folio), MTHP_STAT_SHMEM_ALLOC);
> >>>>  #endif
> >>>>                  goto alloced;
> >>>>          }
> >>>> --
> >>>> 2.43.0
> >>>>
> >>>

Thanks
Barry