On Mon, Sep 11, 2023 at 12:34 PM Michal Hocko <mhocko@xxxxxxxx> wrote:
>
> On Mon 11-09-23 12:15:24, Wei Xu wrote:
> > On Mon, Sep 11, 2023 at 6:11 AM Michal Hocko <mhocko@xxxxxxxx> wrote:
> > >
> > > On Thu 07-09-23 17:52:12, Wei Xu wrote:
> > > [...]
> > > > I tested this patch on a machine with 384 CPUs using a microbenchmark
> > > > that spawns 10K threads, each reading its memory.stat every 100
> > > > milliseconds.
> > >
> > > This is a rather extreme case, but I wouldn't call it utterly insane
> > > though.
> > >
> > > > Most of the memory.stat reads take 5ms-10ms in kernel, with
> > > > ~5% of reads even exceeding 1 second.
> > >
> > > Just curious, what would the numbers look like if the mutex is removed
> > > and those threads contend on the existing spinlock, both with the lock
> > > dropping in place and with it removed? Would you be willing to give it
> > > a shot?
> >
> > Without the mutex and with the spinlock only, the common read latency
> > of memory.stat is still 5ms-10ms in kernel. There are very few reads
> > (<0.003%) going above 10ms and none more than 1 second.
>
> Is this with the existing spinlock dropping and the same 10K potentially
> contending readers?

Yes, it is the same test (10K contending readers). The kernel change is
to remove stats_user_flush_mutex from mem_cgroup_user_flush_stats() so
that the concurrent mem_cgroup_user_flush_stats() requests directly
contend on cgroup_rstat_lock in cgroup_rstat_flush().

> --
> Michal Hocko
> SUSE Labs
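
For reference, a minimal sketch of the test variant described above, not
the actual diff: mem_cgroup_user_flush_stats() and stats_user_flush_mutex
are the names from the patch under discussion, and the body below only
illustrates dropping the mutex so each memory.stat reader calls
cgroup_rstat_flush() directly and contends on cgroup_rstat_lock inside it.

	/*
	 * Sketch only: serializing mutex removed, so concurrent
	 * memory.stat readers flush rstat directly and contend on
	 * cgroup_rstat_lock in cgroup_rstat_flush().
	 */
	static void mem_cgroup_user_flush_stats(struct mem_cgroup *memcg)
	{
		/* was: mutex_lock(&stats_user_flush_mutex); */
		cgroup_rstat_flush(memcg->css.cgroup);
		/* was: mutex_unlock(&stats_user_flush_mutex); */
	}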