On Wed, Jan 26, 2022 at 5:24 AM Mike Kravetz <mike.kravetz@xxxxxxxxxx> wrote: > > On 1/24/22 22:01, Muchun Song wrote: > > On Tue, Jan 25, 2022 at 10:42 AM Zi Yan <ziy@xxxxxxxxxx> wrote: > >> > >> On 24 Jan 2022, at 20:55, Muchun Song wrote: > >> > >>> On Tue, Jan 25, 2022 at 3:22 AM Zi Yan <ziy@xxxxxxxxxx> wrote: > >>>> > >>>> On 24 Jan 2022, at 13:11, David Rientjes wrote: > >>>> > >>>>> On Mon, 24 Jan 2022, Muchun Song wrote: > >>>>> > >>>>>> The D-cache maintenance inside move_to_new_page() only consider one page, > >>>>>> there is still D-cache maintenance issue for tail pages of THP. Fix this > >>>>>> by not using flush_dcache_folio() since it is not backportable. > >>>>>> > >>>>> > >>>>> The mention of being backportable suggests that we should backport this, > >>>>> likely to 4.14+. So should it be marked as stable? > >>>> > >>>> Hmm, after more digging, I am not sure if the bug exists. For THP migration, > >>>> flush_cache_range() is used in remove_migration_pmd(). The flush_dcache_page() > >>>> was added by Lars Persson (cc’d) to solve the data corruption on MIPS[1], > >>>> but THP migration is only enabled on x86_64, PPC_BOOK3S_64, and ARM64. > >>> > >>> I only mention the THP case. After some more thinking, I think the HugeTLB > >>> should also be considered, Right? The HugeTLB is enabled on arm, arm64, > >>> mips, parisc, powerpc, riscv, s390 and sh. > >>> > >> > >> +Mike for HugeTLB > >> > >> If HugeTLB page migration also misses flush_dcache_page() on its tail pages, > >> you will need a different patch for the commit introducing hugetlb page migration. > > > > Agree. I think arm (see the following commit) has handled this issue, while most > > others do not. > > > > commit 0b19f93351dd ("ARM: mm: Add support for flushing HugeTLB pages.") > > > > But I do not have any real devices to test if this issue exists on other archs. > > In theory, it exists. > > > > Thanks for adding me to the discussion. > > I agree that this issue exists at least in theory for hugetlb pages as well. > This made me look at other places with similar code for hugetlb. i.e. > Allocating a new page, copying data to new page and then establishing a > mapping (pte) to the new page. Hi Mike, Thanks for looking at this. > > - hugetlb_cow calls copy_user_huge_page() which ends up calling > copy_user_highpage that includes dcache flushing of the target for some > architectures, but not all. copy_user_page() inside copy_user_highpage() is already considering the cache maintenance on different architectures, which is documented in Documentation/core-api/cachetlb.rst. So there are no problems in this case. > - userfaultfd calls copy_huge_page_from_user which does not appear to do > any dcache flushing for the target page. Right. The new page should be flushed before setting up the mapping to the user space. > Do you think these code paths have the same potential issue? The latter does have the issue, the former does not. The fixes may look like the following: diff --git a/mm/hugetlb.c b/mm/hugetlb.c index a1baa198519a..828240aee3f9 100644 --- a/mm/hugetlb.c +++ b/mm/hugetlb.c @@ -5819,6 +5819,7 @@ int hugetlb_mcopy_atomic_pte(struct mm_struct *dst_mm, goto out; } folio_copy(page_folio(page), page_folio(*pagep)); + flush_dcache_folio(page_folio(page)); put_page(*pagep); *pagep = NULL; } diff --git a/mm/memory.c b/mm/memory.c index e8ce066be5f2..ff6f48cdcc48 100644 --- a/mm/memory.c +++ b/mm/memory.c @@ -5400,6 +5400,7 @@ long copy_huge_page_from_user(struct page *dst_page, kunmap(subpage); else kunmap_atomic(page_kaddr); + flush_dcache_page(subpage); ret_val -= (PAGE_SIZE - rc); if (rc) Thanks.