Re: [PATCH v2] mm: count zeromap read and set for swapout and swapin

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 02.11.24 13:59, Barry Song wrote:
On Sat, Nov 2, 2024 at 8:32 PM Usama Arif <usamaarif642@xxxxxxxxx> wrote:



On 02/11/2024 10:12, Barry Song wrote:
From: Barry Song <v-songbaohua@xxxxxxxx>

When the proportion of folios from the zero map is small, missing their
accounting may not significantly impact profiling. However, it’s easy
to construct a scenario where this becomes an issue—for example,
allocating 1 GB of memory, writing zeros from userspace, followed by
MADV_PAGEOUT, and then swapping it back in. In this case, the swap-out
and swap-in counts seem to vanish into a black hole, potentially
causing semantic ambiguity.

We have two ways to address this:

1. Add a separate counter specifically for the zero map.
2. Continue using the current accounting, treating the zero map like
a normal backend. (This aligns with the current behavior of zRAM
when supporting same-page fills at the device level.)

This patch adopts option 1 as pswpin/pswpout counters are that they
only apply to IO done directly to the backend device (as noted by
Nhat Pham).

We can find these counters from /proc/vmstat (counters for the whole
system) and memcg's memory.stat (counters for the interested memcg).

For example:

$ grep -E 'swpin_zero|swpout_zero' /proc/vmstat
swpin_zero 1648
swpout_zero 33536

$ grep -E 'swpin_zero|swpout_zero' /sys/fs/cgroup/system.slice/memory.stat
swpin_zero 3905
swpout_zero 3985

Fixes: 0ca0c24e3211 ("mm: store zero pages to be swapped out in a bitmap")
I don't think its a hotfix (or even a fix). It was discussed in the initial
series to add these as a follow up and Joshua was going to do this soon.
Its not fixing any bug in the initial series.

I would prefer that all kernel versions with zeromap include this
counter; otherwise,
it could be confusing to determine where swap-in and swap-out have occurred,
as shown by the small program below:

p =malloc(1g);
write p to zero
madvise_pageout
read p;

Previously, there was 1GB of swap-in and swap-out activity reported, but
now nothing is shown.

I don't mean to suggest that there's a bug in the zeromap code; rather,
having this counter would help clear up any confusion.

I didn't realize Joshua was handling it. Is he still planning to? If
so, I can leave it
with Joshua if that was the plan :-)


Cc: Usama Arif <usamaarif642@xxxxxxxxx>
Cc: Chengming Zhou <chengming.zhou@xxxxxxxxx>
Cc: Yosry Ahmed <yosryahmed@xxxxxxxxxx>
Cc: Nhat Pham <nphamcs@xxxxxxxxx>
Cc: Johannes Weiner <hannes@xxxxxxxxxxx>
Cc: David Hildenbrand <david@xxxxxxxxxx>
Cc: Hugh Dickins <hughd@xxxxxxxxxx>
Cc: Matthew Wilcox (Oracle) <willy@xxxxxxxxxxxxx>
Cc: Shakeel Butt <shakeel.butt@xxxxxxxxx>
Cc: Andi Kleen <ak@xxxxxxxxxxxxxxx>
Cc: Baolin Wang <baolin.wang@xxxxxxxxxxxxxxxxx>
Cc: Chris Li <chrisl@xxxxxxxxxx>
Cc: "Huang, Ying" <ying.huang@xxxxxxxxx>
Cc: Kairui Song <kasong@xxxxxxxxxxx>
Cc: Ryan Roberts <ryan.roberts@xxxxxxx>
Signed-off-by: Barry Song <v-songbaohua@xxxxxxxx>
---
  -v2:
  * add separate counters rather than using pswpin/out; thanks
  for the comments from Usama, David, Yosry and Nhat;
  * Usama also suggested a new counter like swapped_zero, I
  prefer that one be separated as an enhancement patch not
  a hotfix. will probably handle it later on.

I dont think either of them would be a hotfix.

As mentioned above, this isn't about fixing a bug; it's simply to ensure
that swap-related metrics don't disappear.

Documentation/process/submitting-patches.rst:

"A Fixes: tag indicates that the patch fixes an issue in a previous commit. It is used to make it easy to determine where a bug originated, which can help review a bug fix."

If there is no BUG, I'm afraid you are abusing that tag.

Also, I don't really understand the problem? We added an optimization, great. Who will be complaining about that?

Above you write "it could be confusing to determine where swap-in and swap-out have occurred" -- when is that confusion supposed to happen in practice? How will the confused individuals know that they must take a look at that new metric, even if it is in place?

I think we should just add the new stats and call it a day.

--
Cheers,

David / dhildenb





[Index of Archives]     [Linux ARM Kernel]     [Linux ARM]     [Linux Omap]     [Fedora ARM]     [IETF Annouce]     [Bugtraq]     [Linux OMAP]     [Linux MIPS]     [eCos]     [Asterisk Internet PBX]     [Linux API]

  Powered by Linux