On Wed, 14 Jun 2017 16:51:43 +0300
"Kirill A. Shutemov" <kirill.shutemov@xxxxxxxxxxxxxxx> wrote:

> Until pmdp_invalidate() pmd entry is present and CPU can update it,
> setting dirty. Currently, we tranfer dirty bit to page too early and
> there is window when we can miss dirty bit.
>
> Let's call SetPageDirty() after pmdp_invalidate().
>
> Signed-off-by: Kirill A. Shutemov <kirill.shutemov@xxxxxxxxxxxxxxx>
> ...
> @@ -2046,6 +2043,14 @@ static void __split_huge_pmd_locked(struct vm_area_struct *vma, pmd_t *pmd,
>  	 * pmd_populate.
>  	 */
>  	pmdp_invalidate(vma, haddr, pmd);
> +
> +	/*
> +	 * Transfer dirty bit to page after pmd invalidated, so CPU would not
> +	 * be able to set it under us.
> +	 */
> +	if (pmd_dirty(*pmd))
> +		SetPageDirty(page);
> +
>  	pmd_populate(mm, pmd, pgtable);
>
>  	if (freeze) {

That won't work on s390. After pmdp_invalidate the pmd entry is gone,
it has been replaced with _SEGMENT_ENTRY_EMPTY. This includes the
dirty and referenced bits. The old scheme is

	entry = *pmd;
	pmdp_invalidate(vma, addr, pmd);
	if (pmd_dirty(entry))
		...

Could we change pmdp_invalidate to make it return the old pmd entry?
The pmdp_xchg_direct function already returns it, for s390 that would
be an easy change. The above code snippet would change like this:

	entry = pmdp_invalidate(vma, addr, pmd);
	if (pmd_dirty(entry))
		...

-- 
blue skies,
   Martin.

"Reality continues to ruin my life." - Calvin.
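
For illustration only, here is a user-space toy model of the two orderings
discussed above. It is not kernel code. Every toy_* name, the bit layout and
the two helper functions are invented for this sketch, and the plain
load/store merely stands in for what would have to be an atomic exchange in
a real implementation. It assumes, as described for s390, that invalidation
replaces the whole entry with an empty value, so the dirty bit can only be
recovered from the value the invalidate step returns.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Toy stand-in for pmd_t; the bit values below are made up. */
typedef struct { uint64_t val; } toy_pmd_t;

#define TOY_PMD_PRESENT 0x1u
#define TOY_PMD_DIRTY   0x2u
#define TOY_PMD_EMPTY   0x0u	/* plays the role of _SEGMENT_ENTRY_EMPTY */

static bool toy_pmd_dirty(toy_pmd_t pmd)
{
	return (pmd.val & TOY_PMD_DIRTY) != 0;
}

/*
 * Hypothetical invalidate that wipes the entry (dirty and referenced
 * bits included) and hands back the old value. Not atomic here; a real
 * version would need an atomic exchange.
 */
static toy_pmd_t toy_pmdp_invalidate(toy_pmd_t *pmdp)
{
	toy_pmd_t old;

	assert(pmdp != NULL);
	old = *pmdp;
	pmdp->val = TOY_PMD_EMPTY;
	return old;
}

/* Ordering from the quoted patch: test *pmd after invalidation. */
static bool dirty_seen_reading_after(toy_pmd_t *pmdp)
{
	(void)toy_pmdp_invalidate(pmdp);
	return toy_pmd_dirty(*pmdp);	/* entry is already empty here */
}

/* Proposed ordering: test the entry returned by the invalidate step. */
static bool dirty_seen_from_return(toy_pmd_t *pmdp)
{
	toy_pmd_t entry = toy_pmdp_invalidate(pmdp);

	return toy_pmd_dirty(entry);
}
```

In this model, an entry that starts out present and dirty is reported as
clean by dirty_seen_reading_after() and as dirty by
dirty_seen_from_return().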