On 28/10/2024 20:42, Barry Song wrote: > On Tue, Oct 29, 2024 at 4:00 AM Usama Arif <usamaarif642@xxxxxxxxx> wrote: >> >> >> >> On 28/10/2024 19:54, Barry Song wrote: >>> On Tue, Oct 29, 2024 at 1:20 AM Usama Arif <usamaarif642@xxxxxxxxx> wrote: >>>> >>>> >>>> >>>> On 28/10/2024 17:08, Yosry Ahmed wrote: >>>>> On Mon, Oct 28, 2024 at 10:00 AM Usama Arif <usamaarif642@xxxxxxxxx> wrote: >>>>>> >>>>>> >>>>>> >>>>>> On 28/10/2024 16:33, Nhat Pham wrote: >>>>>>> On Mon, Oct 28, 2024 at 5:23 AM Usama Arif <usamaarif642@xxxxxxxxx> wrote: >>>>>>>> >>>>>>>> I wonder if instead of having counters, it might be better to keep track >>>>>>>> of the number of zeropages currently stored in zeromap, similar to how >>>>>>>> zswap_same_filled_pages did it. It will be more complicated then this >>>>>>>> patch, but would give more insight of the current state of the system. >>>>>>>> >>>>>>>> Joshua (in CC) was going to have a look at that. >>>>>>> >>>>>>> I don't think one can substitute for the other. >>>>>> >>>>>> Yes agreed, they have separate uses and provide different information, but >>>>>> maybe wasteful to have both types of counters? They are counters so maybe >>>>>> dont consume too much resources but I think we should still think about >>>>>> it.. >>>>> >>>>> Not for or against here, but I would say that statement is debatable >>>>> at best for memcg stats :) >>>>> >>>>> Each new counter consumes 2 longs per-memcg per-CPU (see >>>>> memcg_vmstats_percpu), about 16 bytes, which is not a lot but it can >>>>> quickly add up with a large number of CPUs/memcgs/stats. >>>>> >>>>> Also, when flushing the stats we iterate all of them to propagate >>>>> updates from per-CPU counters. This is already a slowpath so adding >>>>> one stat is not a big deal, but again because we iterate all stats on >>>>> multiple CPUs (and sometimes on each node as well), the overall flush >>>>> latency becomes a concern sometimes. >>>>> >>>>> All of that is not to say we shouldn't add more memcg stats, but we >>>>> have to be mindful of the resources. >>>> >>>> Yes agreed! Plus the cost of incrementing similar counters (which ofcourse is >>>> also not much). >>>> >>>> Not trying to block this patch in anyway. Just think its a good point >>>> to discuss here if we are ok with both types of counters. If its too wasteful >>>> then which one we should have. >>> >>> Hi Usama, >>> my point is that with all the below three counters: >>> 1. PSWPIN/PSWPOUT >>> 2. ZSWPIN/ZSWPOUT >>> 3. SWAPIN_SKIP/SWAPOUT_SKIP or (ZEROSWPIN, ZEROSWPOUT what ever) >>> >>> Shouldn't we have been able to determine the portion of zeromap >>> swap indirectly? >>> >> >> Hmm, I might be wrong, but I would have thought no? >> >> What if you swapout a zero folio, but then discard it? >> zeromap_swpout would be incremented, but zeromap_swapin would not. > > I understand. It looks like we have two issues to tackle: > 1. We shouldn't let zeromap swap in or out anything that vanishes into > a black hole > 2. We want to find out how much I/O/memory has been saved due to zeromap so far > > From my perspective, issue 1 requires a "fix", while issue 2 is more > of an optimization. Hmm I dont understand why point 1 would be an issue. If its discarded thats fine as far as I can see. As a reference, memory.stat.zswapped != memory.stat.zswapout - memory.stat.zswapin. Because zswapped would take into account swapped out anon memory freed, MADV_FREE, shmem truncate, etc as Yosry said about zeromap, But zswapout and zswapin dont. > > I consider issue 1 to be more critical because, after observing a phone > running for some time, I've been able to roughly estimate the portion > zeromap can > help save using only PSWPOUT, ZSWPOUT, and SWAPOUT_SKIP, even without a > SWPIN counter. However, I agree that issue 2 still holds significant value > as a separate patch. > > Thanks > Barry