On Tue, Oct 26, 2021 at 02:30:25PM -0400, Pasha Tatashin wrote:
> On Tue, Oct 26, 2021 at 2:24 PM Matthew Wilcox <willy@xxxxxxxxxxxxx> wrote:
> > I think this is overkill. Won't we get exactly the same protection
> > by simply testing that page->_refcount == 0 in set_page_count()?
> > Anything which triggers that BUG_ON would already be buggy because
> > it can race with speculative gets.
>
> We can't because set_page_count(v) is used for
> 1. changing _refcount from a current value to unconstrained v
> 2. initializing _refcount from undefined state to v.
> In this work we forbid the first case, and reduce the second case to
> initialize only to 1.

Anything that is calling set_page_count() on something which is
not 0 is buggy today. There are several ways to increment the page
refcount speculatively if it is not 0, e.g. lockless GUP and page cache
reads. So we could have:
CPU 0: alloc_page() (refcount now 1)
CPU 1: get_page_unless_zero() (refcount now 2)
CPU 0: set_page_count(5) (refcount now 5)
CPU 1: put_page() (refcount now 4)
Now the refcount is wrong. So it is *only* safe to call
set_page_count() if the refcount is 0. If you can find somewhere
that's calling set_page_count() on a non-0 refcount, that's a bug
that needs to be fixed.