On 1/24/22 22:01, Muchun Song wrote: > On Tue, Jan 25, 2022 at 10:42 AM Zi Yan <ziy@xxxxxxxxxx> wrote: >> >> On 24 Jan 2022, at 20:55, Muchun Song wrote: >> >>> On Tue, Jan 25, 2022 at 3:22 AM Zi Yan <ziy@xxxxxxxxxx> wrote: >>>> >>>> On 24 Jan 2022, at 13:11, David Rientjes wrote: >>>> >>>>> On Mon, 24 Jan 2022, Muchun Song wrote: >>>>> >>>>>> The D-cache maintenance inside move_to_new_page() only consider one page, >>>>>> there is still D-cache maintenance issue for tail pages of THP. Fix this >>>>>> by not using flush_dcache_folio() since it is not backportable. >>>>>> >>>>> >>>>> The mention of being backportable suggests that we should backport this, >>>>> likely to 4.14+. So should it be marked as stable? >>>> >>>> Hmm, after more digging, I am not sure if the bug exists. For THP migration, >>>> flush_cache_range() is used in remove_migration_pmd(). The flush_dcache_page() >>>> was added by Lars Persson (cc’d) to solve the data corruption on MIPS[1], >>>> but THP migration is only enabled on x86_64, PPC_BOOK3S_64, and ARM64. >>> >>> I only mention the THP case. After some more thinking, I think the HugeTLB >>> should also be considered, Right? The HugeTLB is enabled on arm, arm64, >>> mips, parisc, powerpc, riscv, s390 and sh. >>> >> >> +Mike for HugeTLB >> >> If HugeTLB page migration also misses flush_dcache_page() on its tail pages, >> you will need a different patch for the commit introducing hugetlb page migration. > > Agree. I think arm (see the following commit) has handled this issue, while most > others do not. > > commit 0b19f93351dd ("ARM: mm: Add support for flushing HugeTLB pages.") > > But I do not have any real devices to test if this issue exists on other archs. > In theory, it exists. > Thanks for adding me to the discussion. I agree that this issue exists at least in theory for hugetlb pages as well. This made me look at other places with similar code for hugetlb. i.e. Allocating a new page, copying data to new page and then establishing a mapping (pte) to the new page. - hugetlb_cow calls copy_user_huge_page() which ends up calling copy_user_highpage that includes dcache flushing of the target for some architectures, but not all. - userfaultfd calls copy_huge_page_from_user which does not appear to do any dcache flushing for the target page. Do you think these code paths have the same potential issue? -- Mike Kravetz