On Wednesday 14 June 2017 07:21 PM, Kirill A. Shutemov wrote:
Until pmdp_invalidate() pmd entry is present and CPU can update it, setting
dirty. Currently, we transfer dirty bit to page too early and there is window
when we can miss dirty bit.

Let's call SetPageDirty() after pmdp_invalidate().

Signed-off-by: Kirill A. Shutemov <kirill.shutemov@xxxxxxxxxxxxxxx>
---
 mm/huge_memory.c | 13 +++++++++----
 1 file changed, 9 insertions(+), 4 deletions(-)

diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index a84909cf20d3..c4ee5c890910 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -1928,7 +1928,7 @@ static void __split_huge_pmd_locked(struct vm_area_struct *vma, pmd_t *pmd,
 	struct page *page;
 	pgtable_t pgtable;
 	pmd_t _pmd;
-	bool young, write, dirty, soft_dirty;
+	bool young, write, soft_dirty;
 	unsigned long addr;
 	int i;

@@ -1965,7 +1965,6 @@ static void __split_huge_pmd_locked(struct vm_area_struct *vma, pmd_t *pmd,
 	page_ref_add(page, HPAGE_PMD_NR - 1);
 	write = pmd_write(*pmd);
 	young = pmd_young(*pmd);
-	dirty = pmd_dirty(*pmd);
 	soft_dirty = pmd_soft_dirty(*pmd);

 	pmdp_huge_split_prepare(vma, haddr, pmd);
@@ -1995,8 +1994,6 @@ static void __split_huge_pmd_locked(struct vm_area_struct *vma, pmd_t *pmd,
 			if (soft_dirty)
 				entry = pte_mksoft_dirty(entry);
 		}
-		if (dirty)
-			SetPageDirty(page + i);
 		pte = pte_offset_map(&_pmd, addr);
 		BUG_ON(!pte_none(*pte));
 		set_pte_at(mm, addr, pte, entry);
@@ -2046,6 +2043,14 @@ static void __split_huge_pmd_locked(struct vm_area_struct *vma, pmd_t *pmd,
 	 * pmd_populate.
 	 */
 	pmdp_invalidate(vma, haddr, pmd);
+
+	/*
+	 * Transfer dirty bit to page after pmd invalidated, so CPU would not
+	 * be able to set it under us.
+	 */
+	if (pmd_dirty(*pmd))
+		SetPageDirty(page);
+
 	pmd_populate(mm, pmd, pgtable);

You fixed the dirty bit loss I discussed in my previous mail here.

Thanks
-aneesh
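
For anyone following the thread, the ordering problem described in the commit message can be sketched with a small userspace model. This is only an illustration under stated assumptions: struct toy_pmd, the TOY_* flags and the toy_* helpers are hypothetical names invented for this sketch and are not kernel APIs. It models a CPU store that sets the dirty bit only while the entry is present, and compares sampling the dirty bit before invalidation with sampling it afterwards.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only: these flags and helpers are hypothetical, not kernel APIs. */
#define TOY_PRESENT 0x1u
#define TOY_DIRTY   0x2u

struct toy_pmd {
	unsigned int flags;
};

/*
 * Models a CPU store through the mapping: the dirty bit is set only
 * while the entry is still present.
 */
static void toy_cpu_write(struct toy_pmd *pmd)
{
	assert(pmd);
	if (pmd->flags & TOY_PRESENT)
		pmd->flags |= TOY_DIRTY;
}

/* Models invalidation: clears the present bit, keeps the other bits. */
static void toy_invalidate(struct toy_pmd *pmd)
{
	assert(pmd);
	pmd->flags &= ~TOY_PRESENT;
}

/*
 * Old order: the dirty bit is sampled first, then a store lands before
 * the entry is invalidated. Returns what would be transferred to the page.
 */
static bool toy_split_old_order(void)
{
	struct toy_pmd pmd = { TOY_PRESENT };
	bool dirty = (pmd.flags & TOY_DIRTY) != 0;	/* sampled too early */

	toy_cpu_write(&pmd);	/* store in the window sets dirty in the entry */
	toy_invalidate(&pmd);
	return dirty;		/* the store's dirty bit is not reflected */
}

/*
 * New order: invalidate first, then sample. A store after invalidation
 * can no longer change the entry, so the sampled value is stable.
 */
static bool toy_split_new_order(void)
{
	struct toy_pmd pmd = { TOY_PRESENT };

	toy_cpu_write(&pmd);	/* same store, before invalidation */
	toy_invalidate(&pmd);
	toy_cpu_write(&pmd);	/* no effect: entry is not present */
	return (pmd.flags & TOY_DIRTY) != 0;
}
```

In this model toy_split_old_order() returns false (the dirty bit set in the window is missed) while toy_split_new_order() returns true, which mirrors why the patch reads pmd_dirty() only after pmdp_invalidate().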