On 11/5/20 9:55 AM, Alex Shi wrote:
This patch moves the per-node lru_lock into lruvec, thus providing an
lru_lock for each memcg per node. So on a large machine, each memcg no
longer has to suffer from per-node pgdat->lru_lock contention; they can
go faster with their own lru_lock.

Now that the memcg charge is moved before LRU insertion, page isolation
can serialize the page's memcg, so the per-memcg lruvec lock is stable
and can replace the per-node lru_lock.

In isolate_migratepages_block(), compact_unlock_should_abort() and
lock_page_lruvec_irqsave() are open coded to work with compact_control.
Also add a debug function in the locking code which may give some clues
if something gets out of hand.

Daniel Jordan's testing shows a 62% improvement on a modified readtwice
case on his 2P * 10 core * 2 HT Broadwell box.
https://lore.kernel.org/lkml/20200915165807.kpp7uhiw7l3loofu@xxxxxxxxxxxxxxxxxxxxxxxxxx/

On a large machine with memcg enabled but not used, looking up the
page's lruvec traverses a few extra pointers, which may increase
lru_lock hold time and cause a slight regression.

Hugh Dickins helped with the patch polish, thanks!

Signed-off-by: Alex Shi <alex.shi@xxxxxxxxxxxxxxxxx>
Acked-by: Hugh Dickins <hughd@xxxxxxxxxx>
Cc: Rong Chen <rong.a.chen@xxxxxxxxx>
Cc: Hugh Dickins <hughd@xxxxxxxxxx>
Cc: Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx>
Cc: Johannes Weiner <hannes@xxxxxxxxxxx>
Cc: Michal Hocko <mhocko@xxxxxxxxxx>
Cc: Vladimir Davydov <vdavydov.dev@xxxxxxxxx>
Cc: Yang Shi <yang.shi@xxxxxxxxxxxxxxxxx>
Cc: Matthew Wilcox <willy@xxxxxxxxxxxxx>
Cc: Konstantin Khlebnikov <khlebnikov@xxxxxxxxxxxxxx>
Cc: Tejun Heo <tj@xxxxxxxxxx>
Cc: linux-kernel@xxxxxxxxxxxxxxx
Cc: linux-mm@xxxxxxxxx
Cc: cgroups@xxxxxxxxxxxxxxx
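For anyone following along, the per-memcg locking this enables looks
roughly like the sketch below. It is based on the description above, not
on the full patch; the lruvec_memcg_debug() name for the debug hook is my
assumption here:

	struct lruvec *lock_page_lruvec_irqsave(struct page *page,
						unsigned long *flags)
	{
		struct lruvec *lruvec;

		rcu_read_lock();
		/*
		 * Charging now precedes LRU insertion, so the page's
		 * memcg, and hence its lruvec, is stable once isolated.
		 */
		lruvec = mem_cgroup_page_lruvec(page, page_pgdat(page));
		/* Per-memcg, per-node lock replaces pgdat->lru_lock. */
		spin_lock_irqsave(&lruvec->lru_lock, *flags);
		rcu_read_unlock();

		/*
		 * Debug hook (name assumed): complain if the page and
		 * the locked lruvec disagree about the owning memcg.
		 */
		lruvec_memcg_debug(lruvec, page);

		return lruvec;
	}
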
Acked-by: Vlastimil Babka <vbabka@xxxxxxx>