On Tue, May 14, 2019 at 02:25:27PM -0700, Andrew Morton wrote: > On Tue, 9 Jan 2018 02:10:50 -0800 Yu Zhao <yuzhao@xxxxxxxxxx> wrote: > > > > Also what prevents reordering here? There do not seem to be any barriers > > > to prevent __SetPageSwapBacked leak after set_pte_at with your patch. > > > > I assumed mem_cgroup_commit_charge() acted as full barrier. Since you > > explicitly asked the question, I realized my assumption doesn't hold > > when memcg is disabled. So we do need something to prevent reordering > > in my patch. And it brings up the question whether we want to add more > > barrier to other places that call page_add_new_anon_rmap() and > > set_pte_at(). > > Is a new version of this patch planned? Sorry for the late reply. The last time I tried, I didn't come up with a better fix because: 1) as Michal pointed out, we need to make sure the fast gup sees all changes made before set_pte_at(); 2) pairing smp_wmb() in set_pte/pmd_at() with smp_rmb() in gup seems the best way to prevent any potential ordering related problems in the future; 3) but this slows down the paths that don't require the smp_mwb() unnecessarily. I didn't give it further thought because the problem doesn't seem fatal at the time. Now the fast gup has changed and the problem is serious: CPU 1 CPU 1 set_pte_at get_user_pages_fast page_add_new_anon_rmap gup_pte_range __SetPageSwapBacked (fetch) try_get_compound_head page_ref_add_unless __SetPageSwapBacked (store) Or the similar problem could happen to __do_huge_pmd_anonymous_page(), for the reason of missing smp_wmb() between the non-atomic bit op and set_pmd_at(). We could simply replace __SetPageSwapBacked() with its atomic version. But 2) seems more preferable to me because it addresses my original problem: > > I didn't observe the race directly. But I did get few crashes when > > trying to access mem_cgroup of pages returned by get_user_pages_fast(). > > Those page were charged and they showed valid mem_cgroup in kdumps. > > So this led me to think the problem came from premature set_pte_at(). Thoughts? Thanks.