Re: [PATCH v2] mm: do not update memcg stats for NR_{FILE/SHMEM}_PMDMAPPED

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 06.05.24 22:41, Yosry Ahmed wrote:
On Mon, May 6, 2024 at 1:31 PM David Hildenbrand <david@xxxxxxxxxx> wrote:

On 06.05.24 22:18, Yosry Ahmed wrote:
On Mon, May 06, 2024 at 09:50:10PM +0200, David Hildenbrand wrote:
On 06.05.24 21:29, Yosry Ahmed wrote:
Previously, all NR_VM_EVENT_ITEMS stats were maintained per-memcg,
although some of those fields are not exposed anywhere. Commit
14e0f6c957e39 ("memcg: reduce memory for the lruvec and memcg stats")
changed this such that we only maintain the stats we actually expose
per-memcg via a translation table.

Additionally, commit 514462bbe927b ("memcg: warn for unexpected events
and stats") added a warning if a per-memcg stat update is attempted for
a stat that is not in the translation table. The warning started firing
for the NR_{FILE/SHMEM}_PMDMAPPED stat updates in the rmap code. These
stats are not maintained per-memcg, and hence are not in the translation
table.

Do not use __lruvec_stat_mod_folio() when updating NR_FILE_PMDMAPPED and
NR_SHMEM_PMDMAPPED. Use __mod_node_page_state() instead, which updates
the global per-node stats only.

Reported-by: syzbot+9319a4268a640e26b72b@xxxxxxxxxxxxxxxxxxxxxxxxx
Closes: https://lore.kernel.org/lkml/0000000000001b9d500617c8b23c@xxxxxxxxxx
Fixes: 514462bbe927 ("memcg: warn for unexpected events and stats")
Acked-by: Shakeel Butt <shakeel.butt@xxxxxxxxx>
Signed-off-by: Yosry Ahmed <yosryahmed@xxxxxxxxxx>
---
    mm/rmap.c | 15 +++++++++------
    1 file changed, 9 insertions(+), 6 deletions(-)

diff --git a/mm/rmap.c b/mm/rmap.c
index 12be4241474ab..ed7f820369864 100644
--- a/mm/rmap.c
+++ b/mm/rmap.c
@@ -1435,13 +1435,14 @@ static __always_inline void __folio_add_file_rmap(struct folio *folio,
             struct page *page, int nr_pages, struct vm_area_struct *vma,
             enum rmap_level level)
    {
+   pg_data_t *pgdat = folio_pgdat(folio);
     int nr, nr_pmdmapped = 0;
     VM_WARN_ON_FOLIO(folio_test_anon(folio), folio);
     nr = __folio_add_rmap(folio, page, nr_pages, level, &nr_pmdmapped);
     if (nr_pmdmapped)
-           __lruvec_stat_mod_folio(folio, folio_test_swapbacked(folio) ?
+           __mod_node_page_state(pgdat, folio_test_swapbacked(folio) ?
                     NR_SHMEM_PMDMAPPED : NR_FILE_PMDMAPPED, nr_pmdmapped);
     if (nr)
             __lruvec_stat_mod_folio(folio, NR_FILE_MAPPED, nr);
@@ -1493,6 +1494,7 @@ static __always_inline void __folio_remove_rmap(struct folio *folio,
             enum rmap_level level)
    {
     atomic_t *mapped = &folio->_nr_pages_mapped;
+   pg_data_t *pgdat = folio_pgdat(folio);
     int last, nr = 0, nr_pmdmapped = 0;
     bool partially_mapped = false;
     enum node_stat_item idx;
@@ -1540,13 +1542,14 @@ static __always_inline void __folio_remove_rmap(struct folio *folio,
     }
     if (nr_pmdmapped) {
+           /* NR_{FILE/SHMEM}_PMDMAPPED are not maintained per-memcg */
             if (folio_test_anon(folio))
-                   idx = NR_ANON_THPS;
-           else if (folio_test_swapbacked(folio))
-                   idx = NR_SHMEM_PMDMAPPED;
+                   __lruvec_stat_mod_folio(folio, NR_ANON_THPS, -nr_pmdmapped);
             else
-                   idx = NR_FILE_PMDMAPPED;
-           __lruvec_stat_mod_folio(folio, idx, -nr_pmdmapped);
+                   __mod_node_page_state(pgdat,

folio_pgdat(folio) should fit here easily. :)

But I would actually suggest something like the following in mm/rmap.c

I am not a big fan of this. Not because I don't like the abstraction,
but because I think it doesn't go all the way. It abstracts a very
certain case: updating nr_pmdmapped for file folios.


Right. It only removes some of the ugliness ;)

I think if we do this we just add one unnecessary layer of indirection
to one case. If anything people will wonder why we have a helper only
for this case. Just my 2c :)


I think if we opt for abstracting the stats updates in mm/rmap.c, we
should go all the way with something like the following (probably split
as two patches: refactoring then bug fix). WDYT about the below?

diff --git a/mm/rmap.c b/mm/rmap.c
index 12be4241474ab..70d6f6309da01 100644
--- a/mm/rmap.c
+++ b/mm/rmap.c
@@ -1269,6 +1269,28 @@ static void __page_check_anon_rmap(struct folio *folio, struct page *page,
                      page);
   }

+static void __foio_mod_stat(struct folio *folio, int nr, int nr_pmdmapped)
+{
+     int idx;
+
+     if (nr) {
+             idx = folio_test_anon(folio) ? NR_ANON_MAPPED : NR_FILE_MAPPED;
+             __lruvec_stat_mod_folio(folio, idx, nr);
+     }
+     if (nr_pmdmapped) {
+             if (folio_test_anon(folio)) {
+                     idx = NR_ANON_THPS;
+                     __lruvec_stat_mod_folio(folio, idx, nr_pmdmapped);
+             } else {
+                     /* NR_*_PMDMAPPED are not maintained per-memcg */
+                     idx = folio_test_swapbacked(folio) ?
+                             NR_SHMEM_PMDMAPPED : NR_FILE_PMDMAPPED;
+                     __mod_node_page_state(folio_pgdat(folio), idx,
+                                           nr_pmdmapped);
+             }
+     }
+}
+

I didn't suggest that, because in the _anon and _file functions we'll
end up introducing unnecessary folio_test_anon() checks that the
compiler cannot optimize out.

I convinced myself that the folio_test_anon() will be #free because
the struct folio should be already in the cache at this point, of
course I may be delusional :)

Well, nothing is free :) But yes, likely it won't matter much.


We can pass in an @anon boolean parameter, but it becomes an ugliness
tradeoff at this point :)

Agreed.


Anyway, I don't feel strongly either way. I am fine with keeping the
patch as-is, the diff I proposed above, or the diff I proposed with an
@anon parameter of folio_test_anon(). The only option I don't really
like is adding a helper just for the file pmdmapped case.

Let's go with your cleanup here (could do as separate cleanup on top of v2, whatever you prefer).

--
Cheers,

David / dhildenb





[Index of Archives]     [Linux ARM Kernel]     [Linux ARM]     [Linux Omap]     [Fedora ARM]     [IETF Annouce]     [Security]     [Bugtraq]     [Linux OMAP]     [Linux MIPS]     [eCos]     [Asterisk Internet PBX]     [Linux API]     [Monitors]

  Powered by Linux