On Thu, Aug 22, 2024 at 8:59 PM David Hildenbrand <david@xxxxxxxxxx> wrote:
>
> On 22.08.24 10:44, Barry Song wrote:
> > On Thu, Aug 22, 2024 at 12:52 PM Barry Song <21cnbao@xxxxxxxxx> wrote:
> >>
> >> On Thu, Aug 22, 2024 at 5:34 AM David Hildenbrand <david@xxxxxxxxxx> wrote:
> >>>
> >>> On 12.08.24 00:49, Barry Song wrote:
> >>>> From: Barry Song <v-songbaohua@xxxxxxxx>
> >>>>
> >>>> Anon large folios come from three places:
> >>>> 1. new allocated large folios in PF, they will call folio_add_new_anon_rmap()
> >>>> for rmap;
> >>>> 2. a large folio is split into multiple lower-order large folios;
> >>>> 3. a large folio is migrated to a new large folio.
> >>>>
> >>>> In all above three counts, we increase nr_anon by 1;
> >>>>
> >>>> Anon large folios might go either because of be split or be put
> >>>> to free, in these cases, we reduce the count by 1.
> >>>>
> >>>> Folios that have been added to the swap cache but have not yet received
> >>>> an anon mapping won't be counted. This is consistent with the AnonPages
> >>>> statistics in /proc/meminfo.
> >>>
> >>> Thinking out loud, I wonder if we want to have something like that for
> >>> any anon folios (including small ones).
> >>>
> >>> Assume we longterm-pinned an anon folio and unmapped/zapped it. It would
> >>> be quite interesting to see that these are actually anon pages still
> >>> consuming memory. Same with memory leaks, when an anon folio doesn't get
> >>> freed for some reason.
> >>>
> >>> The whole "AnonPages" counter thingy is just confusing, it only counts
> >>> what's currently mapped ... so we'd want something different.
> >>>
> >>> But it's okay to start with large folios only, there we have a new
> >>> interface without that legacy stuff :)
> >>
> >> We have two options to do this:
> >> 1. add a new separate nr_anon_unmapped interface which
> >> counts unmapped anon memory only
> >> 2. let the nr_anon count both mapped and unmapped anon
> >> folios.
> >>
> >> I would assume 1 is clearer as right now AnonPages have been
> >> there for years. and counting all mapped and unmapped together,
> >> we are still lacking an approach to find out anon memory leak
> >> problem you mentioned.
> >>
> >> We are right now comparing nr_anon(including mapped folios only)
> >> with AnonPages to get the distribution of different folio sizes in
> >> performance profiling.
> >>
> >> unmapped_nr_anon should be normally always quite small. otherwise,
> >> something must be wrong.
> >>
> >>>
> >>>>
> >>>> Signed-off-by: Barry Song <v-songbaohua@xxxxxxxx>
> >>>> ---
> >>>>  Documentation/admin-guide/mm/transhuge.rst |  5 +++++
> >>>>  include/linux/huge_mm.h                    | 15 +++++++++++++--
> >>>>  mm/huge_memory.c                           | 13 ++++++++++---
> >>>>  mm/migrate.c                               |  4 ++++
> >>>>  mm/page_alloc.c                            |  5 ++++-
> >>>>  mm/rmap.c                                  |  1 +
> >>>>  6 files changed, 37 insertions(+), 6 deletions(-)
> >>>>
> >>>> diff --git a/Documentation/admin-guide/mm/transhuge.rst b/Documentation/admin-guide/mm/transhuge.rst
> >>>> index 058485daf186..9fdfb46e4560 100644
> >>>> --- a/Documentation/admin-guide/mm/transhuge.rst
> >>>> +++ b/Documentation/admin-guide/mm/transhuge.rst
> >>>> @@ -527,6 +527,11 @@ split_deferred
> >>>>          it would free up some memory. Pages on split queue are going to
> >>>>          be split under memory pressure, if splitting is possible.
> >>>>
> >>>> +nr_anon
> >>>> +       the number of anon huge pages we have in the whole system.
> >>>
> >>> "transparent ..." otherwise people might confuse it with anon hugetlb
> >>> "huge pages" ... :)
> >>>
> >>> I briefly tried coming up with a better name than "nr_anon" but failed.
> >>>
> >>>
> >>
> >> if we might have unmapped_anon counter later, maybe rename it to
> >> nr_anon_mapped? and the new interface we will have in the future
> >> might be nr_anon_unmapped?
>
> We really shouldn't be using the mapped/unmapped terminology here ... we
> allocated pages and turned them into anonymous folios. At some point we
> free them. That's the lifecycle.
>
> >
> > On second thought, this might be incorrect as well. Concepts like 'anon',
> > 'shmem', and 'file' refer to states after mapping. If an 'anon' has been
> > unmapped but is still pinned and not yet freed, it isn't technically an
> > 'anon' anymore?
>
> It's just not mapped, and cannot get mapped, anymore. In the memdesc
> world, we'd be freeing the "struct anon" or "struct folio" once the last
> refcount goes to 0, not once (e.g., temporarily during a failed
> migration?) unmapped.
>
> The important part to me would be: this is memory that was allocated for
> anonymous memory, and it's still around for some reason and not getting
> freed. Usually, we would expect anon memory to get freed fairly quickly
> once unmapped. Except when there are long-term pinnings or other types
> of memory leaks.
>
> You could happily continue using these anon pages via vmsplice() or
> similar, even though the original page table mapping was torn down.
>
> >
> > On the other hand, implementing nr_anon_unmapped could be extremely
> > tricky. I have no idea how to implement it as we are losing those mapping
> > flags.
>
> folio_mapcount() can tell you efficiently whether a folio is mapped or
> not -- and that information will stay for all eternity as long as we
> have any mapcounts :) . It cannot tell "how many" pages of a large folio
> are mapped, but at least "is any page of this large folio mapped".

Exactly. AnonPages decreases by 1 when a page is removed from the rmap,
whereas nr_anon decreases by 1 when an anon folio is freed. So, I would
assume nr_anon includes those pinned and unmapped anon folios but
AnonPages doesn't.

If there's a significant amount of 'leaked' anon, we should consider
having a separate counter for them. For instance, if nr_anon is 100,000
and pinned/unmapped pages account for 50%, then nr_anon alone doesn't
effectively reflect the system's state.

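To make the difference between the two counters concrete, here is a toy
user-space sketch. It is not kernel code: every toy_* name is made up for
illustration, it counts whole folios rather than pages, and it ignores
split, migration and the swap cache. It only assumes what is said above,
i.e. one counter drops at unmap time and the other at free time.

```c
/* Toy user-space model, NOT kernel code. All toy_* names are hypothetical. */
#include <assert.h>

struct toy_folio {
	int mapcount;	/* number of mappings, 0 once fully unmapped */
};

static long toy_mapped_anon;	/* AnonPages-like: drops at unmap time */
static long toy_nr_anon;	/* nr_anon-like: drops at free time */

/* First anon mapping of a newly allocated folio: both counters go up. */
static void toy_add_new_anon_rmap(struct toy_folio *f)
{
	f->mapcount = 1;
	toy_mapped_anon++;
	toy_nr_anon++;
}

/* Removing the last mapping only touches the AnonPages-like counter. */
static void toy_remove_rmap(struct toy_folio *f)
{
	assert(f->mapcount > 0);
	if (--f->mapcount == 0)
		toy_mapped_anon--;
}

/* Dropping the last reference: the folio finally leaves nr_anon. */
static void toy_free(struct toy_folio *f)
{
	assert(f->mapcount == 0);
	toy_nr_anon--;
}

/* Folios that are unmapped but still allocated, e.g. long-term pinned. */
static long toy_unmapped_not_freed(void)
{
	return toy_nr_anon - toy_mapped_anon;
}
```

In this model the gap between the two counters is exactly the set of
unmapped-but-not-yet-freed folios, which is what a separate counter
would expose directly.
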
To implement that, it seems we do need to detect the moment mapcount==0
and the moment of freeing anon?

when mapcount==0 in rmap
        unmapped_pinned_anon++;

when free
        unmapped_pinned_anon--;

Anyway, it seems this is a separate job.

>
> >
> > However, a page that is read-ahead but not yet mapped can still become
> > an anon, which seems slightly less tricky to count though seems still
> > difficult - except anon pages, shmem can be also swapped-backed?
>
> Yes. I'm sure there would be ways to achieve it, but I am not sure if
> it's worth the churn. These pages can be reclaimed easily (I would
> assume? They are not even mapped and were never accessible via GUP), and
> can certainly not have any longterm pinnings or similar. They are more
> like "cached things that could become an anon folio".

Exactly. If no one maps the pages for an extended period, I assume the
LRU will reclaim them as well.

>
> --
> Cheers,
>
> David / dhildenb

Thanks
Barry