Re: [PATCH V2] mm: Recheck page table entry with page table lock held

On Wed, Sep 26, 2018 at 11:19 AM, Aneesh Kumar K.V <aneesh.kumar@xxxxxxxxxxxxx> wrote:
We clear the pte temporarily during read/modify/write update of the pte. If we
take a page fault while the pte is cleared, the application can get SIGBUS. One
such case is with remap_pfn_range without a backing vm_ops->fault callback.
do_fault will return SIGBUS in that case.
What is "remap_pfn_range without a backing vm_ops->fault callback"? Could you elaborate on the scenario?
Is this the case of a driver calling remap_pfn_range() in its mmap() file operation?
If so, why would the fault end up in do_fault()?

cpu 0                                           cpu1
mprotect()
ptep_modify_prot_start()/pte cleared.
.
.                                               page fault.
.
.
ptep_modify_prot_commit()

I am confused by this scenario. CPU 0 calls change_pte_range()->ptep_modify_prot_start() and clears the pte content. Meanwhile, on the other thread, handle_pte_fault() calls pte_offset_map(), which still returns the pte. That pte pointer is valid; only the content of the entry is all zero. So why would it call into do_fault()?

In handle_pte_fault():
    vmf->pte = pte_offset_map(vmf->pmd, vmf->address);
    if (!vmf->pte) {
            return do_fault(vmf);
    }
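
A possible explanation, offered as an assumption about mm/memory.c of this period rather than something stated in this thread: between the pte_offset_map() call and the !vmf->pte test, handle_pte_fault() is believed to snapshot the entry into vmf->orig_pte and reset vmf->pte to NULL when that snapshot is pte_none(). If so, an all-zero entry takes the same path as a missing one. The user-space sketch below models only that dispatch; every model_* name is hypothetical and none of it is kernel code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint64_t model_pte_t;            /* hypothetical stand-in for pte_t */

enum model_path { PATH_DO_FAULT, PATH_EXISTING_PTE };

struct model_vmf {
    model_pte_t *pte;                    /* plays the role of vmf->pte */
    model_pte_t orig_pte;                /* plays the role of vmf->orig_pte */
};

/* An entry whose content is all zero counts as "none". */
static int model_pte_none(model_pte_t pte)
{
    return pte == 0;
}

/*
 * Assumed dispatch: the pointer to the slot is always valid here, but a
 * zero snapshot resets vmf->pte to NULL, so the !vmf->pte test fires.
 */
static enum model_path model_handle_pte_fault(struct model_vmf *vmf,
                                              model_pte_t *slot)
{
    assert(vmf != NULL && slot != NULL);

    vmf->pte = slot;                     /* like pte_offset_map() */
    vmf->orig_pte = *slot;               /* unlocked snapshot of the entry */
    if (model_pte_none(vmf->orig_pte))
        vmf->pte = NULL;                 /* zero content treated as no pte */

    if (!vmf->pte)
        return PATH_DO_FAULT;            /* the path questioned above */
    return PATH_EXISTING_PTE;
}
```

Under this assumption, a fault that lands while CPU 0 has the entry temporarily cleared would reach do_fault() even though the mapping itself is intact.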
 
 

 
Fix this by taking page table lock and rechecking for pte_none.

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@xxxxxxxxxxxxx>
---
V1:
* update commit message.

 mm/memory.c | 31 +++++++++++++++++++++++++++----
 1 file changed, 27 insertions(+), 4 deletions(-)

diff --git a/mm/memory.c b/mm/memory.c
index c467102a5cbc..c2f933184303 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -3745,10 +3745,33 @@ static vm_fault_t do_fault(struct vm_fault *vmf)
        struct vm_area_struct *vma = vmf->vma;
        vm_fault_t ret;

-       /* The VMA was not fully populated on mmap() or missing VM_DONTEXPAND */
-       if (!vma->vm_ops->fault)
-               ret = VM_FAULT_SIGBUS;
-       else if (!(vmf->flags & FAULT_FLAG_WRITE))
+       /*
+        * The VMA was not fully populated on mmap() or missing VM_DONTEXPAND
+        */
+       if (!vma->vm_ops->fault) {
+
+               /*
+                * pmd entries won't be marked none during a R/M/W cycle.
+                */
+               if (unlikely(pmd_none(*vmf->pmd)))
+                       ret = VM_FAULT_SIGBUS;
+               else {
+                       vmf->ptl = pte_lockptr(vmf->vma->vm_mm, vmf->pmd);
+                       /*
+                        * Make sure this is not a temporary clearing of pte
+                        * by holding ptl and checking again. A R/M/W update
+                        * of pte involves: take ptl, clearing the pte so that
+                        * we don't have concurrent modification by hardware
+                        * followed by an update.
+                        */
+                       spin_lock(vmf->ptl);
+                       if (unlikely(pte_none(*vmf->pte)))
+                               ret = VM_FAULT_SIGBUS;
+                       else
+                               ret = VM_FAULT_NOPAGE;
+                       spin_unlock(vmf->ptl);
+               }
+       } else if (!(vmf->flags & FAULT_FLAG_WRITE))
                ret = do_read_fault(vmf);
        else if (!(vma->vm_flags & VM_SHARED))
                ret = do_cow_fault(vmf);
--
2.17.1

