On Tue, Mar 22, 2022 at 1:23 AM Barry Song <21cnbao@xxxxxxxxx> wrote:
>
> On Wed, Mar 9, 2022 at 3:48 PM Yu Zhao <yuzhao@xxxxxxxxxx> wrote:
> >
> > Add /sys/kernel/mm/lru_gen/min_ttl_ms for thrashing prevention, as
> > requested by many desktop users [1].
> >
> > When set to value N, it prevents the working set of N milliseconds
> > from getting evicted. The OOM killer is triggered if this working set
> > cannot be kept in memory. Based on the average human detectable lag
> > (~100ms), N=1000 usually eliminates intolerable lags due to thrashing.
> > Larger values like N=3000 make lags less noticeable at the risk of
> > premature OOM kills.
> >
> > Compared with the size-based approach, e.g., [2], this time-based
> > approach has the following advantages:
> > 1. It is easier to configure because it is agnostic to applications
> > and memory sizes.
> > 2. It is more reliable because it is directly wired to the OOM killer.
> >
>
> how are userspace oom daemons like android lmkd, systemd-oomd supposed
> to work with this time-based oom killer?
> only one of min_ttl_ms and userspace daemon should be enabled? or both
> should be enabled at the same time?

Generally we just need one. lmkd and oomd are more flexible, but
1) they need customizations;
2) not all distros have them;
3) they might get stuck in direct reclaim as well.

The last point is not just a theoretical problem:
a) we had many servers under (global) memory pressure so heavy that
   200+ direct reclaimers on each CPU competed for resources and
   userspace livelocked for 2 hours. Eventually hardware watchdogs
   kicked in.
b) on Chromebooks we have something similar to lmkd, and we still
   frequently observe crashes due to heavy memory pressure, meaning
   some Chrome tabs were stuck in direct reclaim for 120 seconds
   (hung_task_timeout_secs=120).
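
For anyone who wants to try the knob without a userspace daemon, here is a minimal sketch of setting it from a script. Only the sysfs path and the N=1000 value come from the patch description; the helper names `set_min_ttl_ms` and `get_min_ttl_ms` are hypothetical, and rejecting negative values is an assumption of this sketch, not documented kernel behavior. Writing to the real file needs root and a kernel built with this series.

```python
from pathlib import Path

# Knob named in the patch description; it exists only on kernels that
# carry the multi-gen LRU series.
MIN_TTL_PATH = Path("/sys/kernel/mm/lru_gen/min_ttl_ms")


def set_min_ttl_ms(ttl_ms: int, path: Path = MIN_TTL_PATH) -> None:
    """Write N (in milliseconds) to the knob. Hypothetical helper."""
    # Assumption of this sketch: a negative TTL is meaningless, so reject
    # it here before touching the file.
    if ttl_ms < 0:
        raise ValueError("min_ttl_ms must be non-negative")
    path.write_text(f"{ttl_ms}\n")


def get_min_ttl_ms(path: Path = MIN_TTL_PATH) -> int:
    """Read the current value back as an integer. Hypothetical helper."""
    return int(path.read_text().strip())
```

Calling `set_min_ttl_ms(1000)` as root would request the N=1000 setting discussed above. The `path` parameter exists only so the helpers can be exercised against an ordinary file on machines without the knob.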