On Fri, Feb 4, 2022 at 4:03 AM John Hubbard <jhubbard@xxxxxxxxxx> wrote:
>
> On 2/3/22 21:56, Yu Zhao wrote:
> ...
> >>> Got it. IIRC, get_user_pages() doesn't imply a write barrier. If so,
> >>> there should be a smp_wmb() on the other side:
> >>
> >> If I understand it correctly, it actually implies a full memory
> >> barrier, doesn't it?
> >>
> >> Because... gup_pte_range() (fast path) calls try_grab_compound_head(),
> >> which eventually calls* atomic_add_unless(), an atomic conditional RMW
> >> operation with return value, thus fully ordered on success (atomic_t.rst);
> >> (on failure gup_pte_range() falls back to the slow path, below.)
> >>
> >> And follow_page_pte() (slow path) calls try_grab_page(), which also calls
> >> into try_grab_compound_head(), as the above.
>
> Well, doing so was a mistake, actually. I've recently reverted it, via:
> commit c36c04c2e132 ("Revert "mm/gup: small refactoring: simplify
> try_grab_page()""). Details are in the commit log.
>
> Apologies for the confusion that this may have created.

No worries; thanks for the pointers / commit log.

>
> thanks,
> --
> John Hubbard
> NVIDIA
> [...]

--
Mauricio Faria de Oliveira