On Tue, May 28, 2019 at 08:15:23PM +0800, Hillf Danton wrote:

< snip >

> > > > +
> > > > +			get_page(page);
> > > > +			spin_unlock(ptl);
> > > > +			lock_page(page);
> > > > +			err = split_huge_page(page);
> > > > +			unlock_page(page);
> > > > +			put_page(page);
> > > > +			if (!err)
> > > > +				goto regular_page;
> > > > +			return 0;
> > > > +		}
> > > > +
> > > > +		pmdp_test_and_clear_young(vma, addr, pmd);
> > > > +		deactivate_page(page);
> > > > +huge_unlock:
> > > > +		spin_unlock(ptl);
> > > > +		return 0;
> > > > +	}
> > > > +
> > > > +	if (pmd_trans_unstable(pmd))
> > > > +		return 0;
> > > > +
> > > > +regular_page:
> > >
> > > Take a look at pending signal?
> >
> > Do you have any reason to see pending signal here? I want to know what's
> > your requirement so that what's the better place to handle it.
> >
> We could bail out without work done IMO if there is a fatal signal pending.
> And we can do that, if it makes sense to you, before the hard work.

Makes sense, especially for swapping out.
I will add it in the next revision.

> > > >
> > > > +	orig_pte = pte_offset_map_lock(vma->vm_mm, pmd, addr, &ptl);
> > > > +	for (pte = orig_pte; addr < end; pte++, addr += PAGE_SIZE) {
> > >
> > > s/end/next/ ?
> >
> > Why do you think it should be next?
> >
> Simply based on the following line, and afraid that next != end
> > > > +	next = pmd_addr_end(addr, end);

pmd_addr_end() returns the smaller of the two addresses (the next pmd
boundary or end), so end is the more appropriate bound here.
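
For illustration only, the clamping behaviour of pmd_addr_end() can be
modelled in userspace. This is a sketch, not kernel code: the model_*
names are hypothetical, and the 2 MiB pmd size is an assumption (it
matches x86-64 with 4 KiB pages; other configurations differ).

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: one pmd entry covers 2 MiB, as on x86-64 with 4 KiB pages. */
#define MODEL_PMD_SHIFT 21
#define MODEL_PMD_SIZE  ((uint64_t)1 << MODEL_PMD_SHIFT)
#define MODEL_PMD_MASK  (~(MODEL_PMD_SIZE - 1))

/*
 * Hypothetical userspace model of pmd_addr_end(): return the next pmd
 * boundary above addr, clamped so that it never exceeds end.
 * The "- 1" on both sides keeps the comparison correct when the
 * boundary or end wraps around to 0 at the top of the address space.
 */
static uint64_t model_pmd_addr_end(uint64_t addr, uint64_t end)
{
	uint64_t boundary = (addr + MODEL_PMD_SIZE) & MODEL_PMD_MASK;

	return (boundary - 1 < end - 1) ? boundary : end;
}

/* Small self-check showing both outcomes of the clamp. */
static void model_pmd_addr_end_demo(void)
{
	/* Range crosses a pmd boundary: the boundary wins. */
	assert(model_pmd_addr_end(0x1ff000, 0x400000) == 0x200000);
	/* Range ends inside the current pmd: end wins. */
	assert(model_pmd_addr_end(0x200000, 0x300000) == 0x300000);
}
```

In this model the result is never larger than end, so next and end
coincide whenever the range passed in already lies within a single pmd.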
>
> > > > +		ptent = *pte;
> > > > +
> > > > +		if (pte_none(ptent))
> > > > +			continue;
> > > > +
> > > > +		if (!pte_present(ptent))
> > > > +			continue;
> > > > +
> > > > +		page = vm_normal_page(vma, addr, ptent);
> > > > +		if (!page)
> > > > +			continue;
> > > > +
> > > > +		if (page_mapcount(page) > 1)
> > > > +			continue;
> > > > +
> > > > +		ptep_test_and_clear_young(vma, addr, pte);
> > > > +		deactivate_page(page);
> > > > +	}
> > > > +
> > > > +	pte_unmap_unlock(orig_pte, ptl);
> > > > +	cond_resched();
> > > > +
> > > > +	return 0;
> > > > +}
> > > > +
> > > > +static long madvise_cool(struct vm_area_struct *vma,
> > > > +				unsigned long start_addr, unsigned long end_addr)
> > > > +{
> > > > +	struct mm_struct *mm = vma->vm_mm;
> > > > +	struct mmu_gather tlb;
> > > > +
> > > > +	if (vma->vm_flags & (VM_LOCKED|VM_HUGETLB|VM_PFNMAP))
> > > > +		return -EINVAL;
> > >
> > > No service in case of VM_IO?
> >
> > I don't know VM_IO would have regular LRU pages but just follow normal
> > convention for DONTNEED and FREE.
> > Do you have anything in your mind?
> >
> I want to skip a mapping set up for DMA.

Do you mean that pages in a VM_IO vma are not on the LRU list?
Or that pages in the vma are always pinned, so it is not worth
deactivating or reclaiming them?