On Thu, Apr 13, 2023 at 3:43 PM Kalesh Singh <kaleshsingh@xxxxxxxxxx> wrote:
>
> Android 14 and later default to MGLRU [1] and field telemetry showed
> occasional long tail latency (>100ms) in the reclaim path.
>
> Tracing revealed priority inversion in the reclaim path. In
> try_to_inc_max_seq(), when high priority tasks were blocked on
> wait_event_killable(), the preemption of the low priority task to call
> wake_up_all() caused those high priority tasks to wait longer than
> necessary. In general, this problem is not different from others of
> its kind, e.g., one caused by mutex_lock(). However, it is specific to
> MGLRU because it introduced the new wait queue lruvec->mm_state.wait.
>
> The purpose of this new wait queue is to avoid the thundering herd
> problem. If many direct reclaimers rush into try_to_inc_max_seq(),
> only one can succeed, i.e., the one to wake up the rest, and the rest
> who failed might cause premature OOM kills if they do not wait. So far
> there is no evidence supporting this scenario, based on how often the
> wait has been hit. And this begs the question how useful the wait
> queue is in practice.
>
> Based on Minchan's recommendation, which is in line with his commit
> 6d4675e60135 ("mm: don't be stuck to rmap lock on reclaim path") and
> the rest of the MGLRU code which also uses trylock when possible,
> remove the wait queue.
>
> [1] https://android-review.googlesource.com/q/I7ed7fbfd6ef9ce10053347528125dd98c39e50bf
>
> Fixes: bd74fdaea146 ("mm: multi-gen LRU: support page table walks")
> Cc: Yu Zhao <yuzhao@xxxxxxxxxx>
> Cc: Minchan Kim <minchan@xxxxxxxxxx>
> Reported-by: Wei Wang <wvw@xxxxxxxxxx>
> Suggested-by: Minchan Kim <minchan@xxxxxxxxxx>
> Signed-off-by: Kalesh Singh <kaleshsingh@xxxxxxxxxx>

Acked-by: Yu Zhao <yuzhao@xxxxxxxxxx>
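
For readers unfamiliar with the pattern, the trylock-style behavior the
quoted commit message argues for (one caller does the work, the others
return immediately instead of sleeping on a wait queue) can be sketched
in userspace C. This is only an illustration under stated assumptions:
struct gen_state, walker_busy, gen_state_init() and try_inc_seq() are
hypothetical names, C11 atomics stand in for kernel primitives, and none
of this is the actual patch or the kernel's try_to_inc_max_seq().

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for per-lruvec state; not a kernel structure. */
struct gen_state {
    atomic_flag walker_busy;  /* set while one caller is advancing max_seq */
    atomic_ulong max_seq;     /* generation counter being advanced */
};

/* Put the state into a known idle condition: flag clear, counter at 0. */
static void gen_state_init(struct gen_state *s)
{
    assert(s);
    atomic_flag_clear(&s->walker_busy);
    atomic_init(&s->max_seq, 0);
}

/*
 * Try to advance max_seq by one generation.
 *
 * Returns true if this caller did the work. Returns false right away if
 * another caller already holds walker_busy; the loser does not sleep, so
 * it can never be left waiting on a preempted low-priority holder.
 */
static bool try_inc_seq(struct gen_state *s)
{
    assert(s);

    /* Trylock: test_and_set returns the previous value of the flag. */
    if (atomic_flag_test_and_set(&s->walker_busy))
        return false;                     /* someone else is walking */

    atomic_fetch_add(&s->max_seq, 1);     /* placeholder for the real walk */

    atomic_flag_clear(&s->walker_busy);   /* unlock; nobody to wake up */
    return true;
}
```

The trade-off in this sketch matches the one discussed in the quoted
message: callers that get false make no progress on their own and must
fall back to whatever the caller does next, which is the premature-OOM
concern that, per the commit message, has no supporting evidence so far.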