On Fri, Nov 17, 2023 at 8:46 PM Zhongkun He
<hezhongkun.hzk@xxxxxxxxxxxxx> wrote:
>
> Hi Chris, thanks for your time.
>
> >
> > On Fri, Nov 17, 2023 at 1:56 AM Zhongkun He
> > <hezhongkun.hzk@xxxxxxxxxxxxx> wrote:
> > > Hi Chris, thanks for your feedback. I have the same concerns,
> > > maybe we should just move the zswap_invalidate() out of batches,
> > > as Yosry mentioned above.
> >
> > As I replied in the previous email, I just want to understand the
> > other side effects of the change better.
> >
> > To me, this patch is actually freeing memory that does not
> > require actual page IO write from zswap, which means the memory is
> > from some kind of cache. It would be nice if we could avoid
> > complicating the writeback path further. Instead, we can drop that
> > memory from the different caches if needed. I assume those caches are
> > doing something useful in the common case. If not, we should have a
> > patch to remove these caches instead. Not sure how big a mess it will
> > be to separate the write and drop caches.
> >
> > While you are here, I have some questions for you.
> >
> > Can you help me understand how much memory you can free from this
> > patch? For example, are we talking about a few pages or a few GB?
> >
> > Where does the freed memory come from?
> > If the memory comes from the zswap entry struct, then due to slab
> > allocator fragmentation it would take a lot of zswap entries to have
> > meaningful memory reclaimed from the slab allocator.
> >
> > If the memory comes from the swap cached pages, that would be much
> > more meaningful. But that is not what this patch is doing, right?
> >
> > Chris
>
> It's my bad for putting two cases together. The memory released in both
> cases comes from the zswap entry struct and the zswap compressed page.
>
> The original intention of this patch is to solve the problem that
> shrink_worker() fails to reclaim memory in two situations.
>
> For case (1), zswap_writeback_entry() will fail because
> __read_swap_cache_async() returns NULL: the swap entry has been
> freed but is still cached in swap_slots_cache, so the memory comes from
> the zswap entry struct and compressed page.
> Count = SWAP_BATCH * ncpu.
> Solution: move the zswap_invalidate() out of batches, free it once the
> swap count equals 0.
>
> For case (2), zswap_writeback_entry() will fail on !page_was_allocated
> because zswap_load() leaves two copies of the same page in memory
> (compressed and uncompressed) after faulting in a page from zswap when
> zswap_exclusive_loads is disabled. The amount of memory is greater but
> depends on the usage.
>
> Why do we need to release them?
> Consider this scenario: there is a lot of data cached in memory and
> zswap, zswap hits the limit, and shrink_worker will fail. Newly arriving
> data will be written directly to swap due to zswap_store failure. Should
> we free the last one to store the latest one in zswap?

Shameless plug: zswap is much less likely to hit the limit (global or
cgroup) with the shrinker enabled ;) It will proactively reclaim the
objects way ahead of the limit.

It comes with its own can of worms, of course - it's unlikely to work
for all workloads in its current form, but perhaps worth experimenting
with/improving upon?

>
> According to the previous discussion, the writeback is inevitable.
> So I want to make zswap_exclusive_loads_enabled the default behavior
> or make it the only way to do zswap loads. It only makes sense when
> the page is read and no longer dirty. If the page is read frequently, it
> should stay in cache rather than zswap. The benefit of doing this is
> very small, i.e. two copies of the same page in memory.
>
> Thanks again.
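
On the "few pages or a few GB" question for case (1), a rough
back-of-envelope sketch of the Count = SWAP_BATCH * ncpu bound quoted
above. Everything numeric here is an assumption, not something stated in
this thread: SWAP_BATCH = 64 (please check include/linux/swap.h in your
tree), 4 KiB pages, and the worst case where each stale entry holds a
full page of compressed data. The function names are made up for
illustration, and the size of the zswap entry struct itself is ignored.

```python
# Back-of-envelope bound for case (1). All constants are assumptions.
SWAP_BATCH = 64      # assumed value; check include/linux/swap.h in your tree
PAGE_SIZE = 4096     # assumed 4 KiB pages


def stale_entry_bound(ncpu, swap_batch=SWAP_BATCH):
    """Upper bound on zswap entries held back by per-CPU swap slot caches."""
    return swap_batch * ncpu


def stale_bytes_bound(ncpu, page_size=PAGE_SIZE, swap_batch=SWAP_BATCH):
    """Worst case: every stale entry holds a full page of compressed data."""
    return stale_entry_bound(ncpu, swap_batch) * page_size


if __name__ == "__main__":
    for ncpu in (8, 64, 256):
        entries = stale_entry_bound(ncpu)
        mib = stale_bytes_bound(ncpu) / (1 << 20)
        print(f"{ncpu:4d} CPUs: <= {entries} entries, <= {mib:.0f} MiB compressed")
```

Under those assumptions case (1) would top out in the MiB range even on
large machines, and real numbers should be lower since compressed pages
are normally well under a full page. Case (2) is not covered by this
estimate; as noted above, it depends on usage.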