On Wed, Oct 23, 2024 at 6:27 PM Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx> wrote:
>
> On Wed, 23 Oct 2024 03:24:38 +0800 Kairui Song <ryncsn@xxxxxxxxx> wrote:
>
> > After this series, lock contention on si->lock is nearly unobservable
> > with `perf lock` with the same test above :
> >
> >  contended   total wait     max wait     avg wait         type   caller
> > ... snip ...
> >         91    204.62 us      4.51 us      2.25 us     spinlock   cluster_move+0x2e
> > ... snip ...
> >         47    125.62 us      4.47 us      2.67 us     spinlock   cluster_move+0x2e
> > ... snip ...
> >         23     63.15 us      3.95 us      2.74 us     spinlock   cluster_move+0x2e
> > ... snip ...
> >         17     41.26 us      4.58 us      2.43 us     spinlock   cluster_isolate_lock+0x1d
> > ... snip ...
>
> Were any overall runtime benefits observed?

Yes, see the "Tests" results in the cover letter (summary: up to ~50% of
the build time saved for the Linux kernel build test under memory
pressure, with either mTHP or 4K pages):

time make -j96 / 768M memcg, 4K pages, 10G ZRAM, on Intel 8255C * 2 in VM:
(avg of 4 test runs)
Before:
Sys time: 73578.30, Real time: 864.05
After: (-54.7% sys time, -49.3% real time)
Sys time: 33314.76, Real time: 437.67

time make -j96 / 1152M memcg, 64K mTHP, 10G ZRAM, on Intel 8255C * 2 in VM:
(avg of 4 test runs)
Before:
Sys time: 74044.85, Real time: 846.51
After: (-51.4% sys time, -47.7% real time, -63.2% mTHP failure)
Sys time: 35958.87, Real time: 442.69

Tests on the host bare metal showed similar results.

There are some other test results I didn't include in the V1 cover
letter yet, and I'm still testing more scenarios, e.g. a mysql test in a
1G memcg with 96 workers and ZRAM swap:

before:
transactions: 755630 (6292.11 per sec.)
queries: 12090080 (100673.69 per sec.)

after:
transactions: 1077156 (8972.73 per sec.)
queries: 17234496 (143563.65 per sec.)

That's ~30% faster. Also, the mTHP swap allocation success rate is
higher; I can highlight these changes in V2.
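
For reference, a rough sketch of the kind of environment these numbers
come from (not the exact scripts used; the cgroup name, paths and sizes
below are illustrative):

  # 10G ZRAM swap device (assuming no other swap is active)
  modprobe zram
  echo 10G > /sys/block/zram0/disksize
  mkswap /dev/zram0 && swapon /dev/zram0

  # memory pressure via a cgroup v2 memcg limit (768M for the 4K test,
  # 1152M for the 64K mTHP test)
  mkdir /sys/fs/cgroup/kbuild
  echo 768M > /sys/fs/cgroup/kbuild/memory.max
  echo $$ > /sys/fs/cgroup/kbuild/cgroup.procs

  # enable 64K mTHP for the mTHP runs
  echo always > /sys/kernel/mm/transparent_hugepage/hugepages-64kB/enabled

  # build inside the limited cgroup; sys/real times are from time(1)
  time make -j96

  # the lock contention table quoted above is from perf lock contention
  # (BPF mode), sampled system-wide while the build is running
  perf lock contention -a -b sleep 10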