On Tue, 2025-03-11 at 22:33 +1300, Barry Song wrote:
>
> External email : Please do not click links or open attachments until
> you have verified the sender or the content.
>
> On Tue, Mar 11, 2025 at 5:58 PM Sergey Senozhatsky
> <senozhatsky@xxxxxxxxxxxx> wrote:
> >
> > On (25/03/08 18:41), Barry Song wrote:
> > > On Sat, Mar 8, 2025 at 12:03 PM Nhat Pham <nphamcs@xxxxxxxxx> wrote:
> > > >
> > > > On Fri, Mar 7, 2025 at 4:02 AM Qun-Wei Lin
> > > > <qun-wei.lin@xxxxxxxxxxxx> wrote:
> > > > >
> > > > > This patch series introduces a new mechanism called kcompressd
> > > > > to improve the efficiency of memory reclaiming in the operating
> > > > > system. The main goal is to separate the tasks of page scanning
> > > > > and page compression into distinct processes or threads, thereby
> > > > > reducing the load on the kswapd thread and enhancing overall
> > > > > system performance under high memory pressure conditions.
> > > >
> > > > Please excuse my ignorance, but from your cover letter I still
> > > > don't quite get what the problem is here. And how would decoupling
> > > > compression and scanning help?
> > >
> > > My understanding is as follows:
> > >
> > > When kswapd attempts to reclaim M anonymous folios and N file
> > > folios, the process involves the following steps:
> > >
> > > * t1: Time to scan and unmap anonymous folios
> > > * t2: Time to compress anonymous folios
> > > * t3: Time to reclaim file folios
> > >
> > > Currently, these steps are executed sequentially, meaning the total
> > > time required to reclaim M + N folios is t1 + t2 + t3.
> > >
> > > However, Qun-Wei's patch enables t1 + t3 and t2 to run in parallel,
> > > reducing the total time to max(t1 + t3, t2). This likely improves
> > > the reclamation speed, potentially reducing allocation stalls.
> >
> > If compression kthread-s can run (have CPUs to be scheduled on).
> > This looks a bit like a bottleneck. Is there anything that
> > guarantees forward progress? Also, if compression kthreads
> > constantly preempt kswapd, then it might not be worth it to
> > have compression kthreads, I assume?
>
> Thanks for your critical insights, all of which are valuable.
>
> Qun-Wei is likely working on an Android case where the CPU is
> relatively idle in many scenarios (though there are certainly cases
> where all CPUs are busy), but free memory is quite limited.
> We may soon see benefits for these types of use cases. I expect
> Android might have the opportunity to adopt it before it's fully
> ready upstream.
>
> If the workload keeps all CPUs busy, I suppose this async thread
> won't help, but at least we might find a way to mitigate regression.
>
> We likely need to collect more data on various scenarios -- when
> CPUs are relatively idle and when all CPUs are busy -- and
> determine the proper approach based on the data, which we
> currently lack :-)

Thanks for explaining!

> > If we have a pagefault and need to map a page that is still in
> > the compression queue (not compressed and stored in zram yet,
> > e.g. due to scheduling latency + slow compression algorithm),
> > then what happens?
>
> This is happening now even without the patch? Right now we have
> 4 steps:
> 1. add_to_swap: The folio is added to the swapcache.
> 2. try_to_unmap: PTEs are converted to swap entries.
> 3. pageout: The folio is written back.
> 4. Swapcache is cleared.
>
> If a swap-in occurs between 2 and 4, doesn't that mean
> we've already encountered the case where we hit
> the swapcache for a folio undergoing compression?
>
> It seems we might have an opportunity to terminate
> compression if the request is still in the queue and
> compression hasn't started for a folio yet? That seems
> quite difficult to do.

As Barry explained, the folios that are being compressed are in the
swapcache.
If a refault occurs during the compression process, its correctness is
already guaranteed by the swap subsystem (similar to other asynchronous
swap devices).

Indeed, terminating compression for a folio that is already in the
queue waiting to be compressed is a challenging task. Will this require
some modifications to the current architecture of the swap subsystem?

> Thanks
> Barry

Best Regards,
Qun-wei
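[Editorial illustration.] The mechanics discussed above -- kswapd handing folios to an asynchronous compressor while the swapcache keeps refaults correct -- can be illustrated with a toy model. This is a hypothetical Python sketch, not kernel code: every name in it (MiniSwap, kswapd_pageout, kcompressd_loop, refault) is invented for illustration, and locking/data structures are deliberately simplistic. It also sketches the "skip it if the request is still queued" idea Barry raises: the compressor ignores any folio that a refault has already pulled back out of the swapcache.

```python
import queue
import threading
import zlib


class MiniSwap:
    """Toy model of the kswapd/kcompressd split (illustrative only).

    A folio handed to the compressor stays in a 'swapcache' set until
    compression finishes, so a refault in the window between queueing
    and compression can always find the data -- mirroring the point
    that correctness is handled by the swapcache, as with any
    asynchronous swap device.
    """

    def __init__(self):
        self.swapcache = set()      # folios visible to refaults
        self.compressed = {}        # folio -> compressed payload
        self.todo = queue.Queue()   # kswapd -> kcompressd handoff
        self.lock = threading.Lock()

    def kswapd_pageout(self, folio):
        # Steps 1-2 of the sequence quoted above (add_to_swap,
        # try_to_unmap), then queue for asynchronous compression
        # instead of compressing inline in kswapd.
        with self.lock:
            self.swapcache.add(folio)
        self.todo.put(folio)

    def kcompressd_loop(self):
        # The asynchronous compressor (step 3, pageout).
        while True:
            folio = self.todo.get()
            if folio is None:       # shutdown sentinel
                return
            with self.lock:
                if folio not in self.swapcache:
                    continue        # already refaulted: skip the work
                self.compressed[folio] = zlib.compress(folio.encode())
                self.swapcache.discard(folio)   # step 4: clear swapcache

    def refault(self, folio):
        # A fault before compression completes simply hits the
        # swapcache; a fault afterwards decompresses from the store.
        with self.lock:
            if folio in self.swapcache:
                self.swapcache.discard(folio)
                return folio        # served straight from swapcache
            return zlib.decompress(self.compressed.pop(folio)).decode()
```

In this toy, the "terminate a queued request" case costs one set lookup, precisely because the queue entry and the swapcache entry are trivially coupled; the thread above suggests doing the equivalent in the real swap subsystem is much harder.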