On Sun, Oct 8, 2023 at 9:17 AM Huang, Ying <ying.huang@xxxxxxxxx> wrote:
>
> Jianlin Lv <iecedge@xxxxxxxxx> writes:
>
> > From: Jianlin Lv <iecedge@xxxxxxxxx>
> >
> > Global reclaim will swap even if swappiness is set to 0.
>
> Why?  Can you elaborate the situation?

We reproduced the issue of pages being swapped out even when swappiness
is set to 0 in the production environment, using the following test
program. I am not sure whether this program can reproduce the issue in
every environment.

From the implementation of get_scan_count(), it can be seen that memory
reclaim chooses a scan method (SCAN_ANON/SCAN_FILE/SCAN_FRACT) based on
the current runtime state, and that choice determines how aggressively
the anon and file LRUs are scanned. This introduces uncertainty. For the
JVM issue at hand, we expect a deterministic SCAN_FILE scan so that anon
pages are not swapped out.

code::

#!/usr/bin/env python

import mmap
import os
import sys


def write_files():
    # Generate page cache pressure: keep writing 100 MB files forever,
    # cycling over 6000 file names.
    count = 1
    if not os.path.isdir(WRITE_DIR):
        os.mkdir(WRITE_DIR)
    while True:
        _, i = divmod(count, 6000)
        file = "{}/{}_{}.txt".format(WRITE_DIR, WRITE_FILE, i)
        with open(file, 'w') as f:
            # Write 100 MB to a file
            num_chars = 100 * 1024 * 1024
            f.write('0' * num_chars)
        count = count + 1


def create_read_file():
    # Create the roughly 10 GB input file.
    with open(READ_FILE, 'wb') as f:
        num_chars = 10000 * 1024 * 1024
        f.write(b'0' * num_chars)


def read_file():
    with open(READ_FILE, mode="r", encoding="utf8") as f:
        mm = mmap.mmap(f.fileno(), length=0, access=mmap.ACCESS_READ)
        # mm.read() copies the whole file into an anonymous buffer that
        # stays alive while write_files() runs.
        text = mm.read()
        write_files()


WRITE_FILE = "file"
WRITE_DIR = "/tmp/rm_rf_me"
READ_FILE = "/tmp/10g_file_delete"

if not os.path.isfile(READ_FILE):
    create_read_file()
read_file()

Jianlin

>
> > In particular
> > case, users wish to be able to completely disable swap for specific
> > processes. One scenario is that if JVM memory pages falls into swap,
> > the performance will noticeably reduce and the GC pauses tend to increase
> > to levels not tolerable by most applications.
> > If it's possible to only disable swap out for specific processes, it can
> > address the JVM GC pauses issues, and at the same time, memory reclaim
> > pressure is also manageable.
> >
> > This patch adds "memory.swap_force_disable" control file to support disable
> > swap for non-root cgroup. When process is associated with a cgroup,
> > 'echo 1 > memory.swap_force_disable' will forbid anon pages be swapped out.
> > This patch also adds read and write handler of the control file.
>
> --
> Best Regards,
> Huang, Ying
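
One way to check whether a run of the reproducer really causes swap-out
with swappiness at 0 is to compare the pswpout counter in /proc/vmstat
before and after. The following is only a minimal sketch and not part of
the patch: the helper names (parse_vmstat, swapped_out_pages, watch) are
hypothetical, and the counter is system-wide, so other workloads on the
machine can also contribute to the delta.

```python
#!/usr/bin/env python3
# Hypothetical helper, not part of the patch or of the reproducer.

import time


def parse_vmstat(text):
    """Parse /proc/vmstat-style 'name value' lines into a dict of ints."""
    counters = {}
    for line in text.splitlines():
        parts = line.split()
        # Skip anything that is not exactly 'name <non-negative integer>'.
        if len(parts) == 2 and parts[1].isdigit():
            counters[parts[0]] = int(parts[1])
    return counters


def swapped_out_pages(before, after):
    """Pages swapped out between two snapshots (system-wide pswpout delta)."""
    return after.get("pswpout", 0) - before.get("pswpout", 0)


def read_text(path):
    with open(path) as f:
        return f.read()


def watch(interval_s):
    """Print the global swappiness and the pswpout delta over interval_s."""
    print("vm.swappiness =", read_text("/proc/sys/vm/swappiness").strip())
    before = parse_vmstat(read_text("/proc/vmstat"))
    time.sleep(interval_s)  # run the reproducer in another shell meanwhile
    after = parse_vmstat(read_text("/proc/vmstat"))
    print("pages swapped out:", swapped_out_pages(before, after))
```

Calling watch(60) while the reproducer runs in another shell prints the
number of pages swapped out during that minute. A non-zero value with
vm.swappiness at 0 would match the behavior described above.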