On Thu, Oct 29, 2020 at 06:45:00PM +0800, Alex Shi wrote: > Currently lru_lock still guards both lru list and page's lru bit, that's > ok. but if we want to use specific lruvec lock on the page, we need to > pin down the page's lruvec/memcg during locking. Just taking lruvec > lock first may be undermined by the page's memcg charge/migration. To > fix this problem, we could clear the lru bit out of locking and use > it as pin down action to block the page isolation in memcg changing. Small nit, but the use of "could" in this sentence sounds like you're describing one possible solution that isn't being taken, when in fact you are describing the chosen locking mechanism. Replacing "could" with "will" would make things a bit clearer IMO. > So now a standard steps of page isolation is following: > 1, get_page(); #pin the page avoid to be free > 2, TestClearPageLRU(); #block other isolation like memcg change > 3, spin_lock on lru_lock; #serialize lru list access > 4, delete page from lru list; > The step 2 could be optimzed/replaced in scenarios which page is > unlikely be accessed or be moved between memcgs. This is a bit ominous. I'd either elaborate / provide an example / clarify why some sites can deal with races - or just remove that sentence altogether from this part of the changelog. > This patch start with the first part: TestClearPageLRU, which combines > PageLRU check and ClearPageLRU into a macro func TestClearPageLRU. This > function will be used as page isolation precondition to prevent other > isolations some where else. Then there are may !PageLRU page on lru > list, need to remove BUG() checking accordingly. > > There 2 rules for lru bit now: > 1, the lru bit still indicate if a page on lru list, just in some > temporary moment(isolating), the page may have no lru bit when > it's on lru list. but the page still must be on lru list when the > lru bit set. > 2, have to remove lru bit before delete it from lru list. > > As Andrew Morton mentioned this change would dirty cacheline for page > isn't on LRU. But the lost would be acceptable in Rong Chen > <rong.a.chen@xxxxxxxxx> report: > https://lore.kernel.org/lkml/20200304090301.GB5972@shao2-debian/ > > Suggested-by: Johannes Weiner <hannes@xxxxxxxxxxx> > Signed-off-by: Alex Shi <alex.shi@xxxxxxxxxxxxxxxxx> > Acked-by: Hugh Dickins <hughd@xxxxxxxxxx> > Cc: Hugh Dickins <hughd@xxxxxxxxxx> > Cc: Johannes Weiner <hannes@xxxxxxxxxxx> > Cc: Michal Hocko <mhocko@xxxxxxxxxx> > Cc: Vladimir Davydov <vdavydov.dev@xxxxxxxxx> > Cc: Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx> > Cc: linux-kernel@xxxxxxxxxxxxxxx > Cc: cgroups@xxxxxxxxxxxxxxx > Cc: linux-mm@xxxxxxxxx Acked-by: Johannes Weiner <hannes@xxxxxxxxxxx>