Hi, On Mon, Dec 16, 2013 at 10:18:39AM -0500, Sasha Levin wrote: > On 12/16/2013 07:47 AM, Kirill A. Shutemov wrote: > > I probably miss some context here. Do you have crash on some use-case or > > what? Could you point me to start of discussion. > > Yes, Sorry, here's the crash that started this discussion originally: > > The code points to: > At this point pmd_none_or_trans_huge_or_clear_bad guaranteed us the pmd points to a regular pte. And in turn the *pmd value is stable and cannot change from under us as long as we hold the mmap_sem for reading (writing not required). pmd_none_or_trans_huge_or_clear_bad implements a proper barrier() to be sure to check a single snapshot of the pmdval, and we read it atomically on 32bit archs too. (64bit always relies on gcc everywhere to access pagetables in a single instruction, including when we write pagetables, or the CPU could also get confused during TLB miss) Hmm we can optimize away the barrier() with an ACCESS_ONCE(*pmdp), but it's not related to this, the full barrier() is safer if something. > for (index = start; index != end; index += PAGE_SIZE) { > pte_t pte; > swp_entry_t entry; > struct page *page; > spinlock_t *ptl; > > orig_pte = pte_offset_map_lock(vma->vm_mm, pmd, start, &ptl); <=== HERE > pte = *(orig_pte + ((index - start) / PAGE_SIZE)); > pte_unmap_unlock(orig_pte, ptl); This code looks weird, why is it doing the math of index-start/PAGE_SIZE when it could just pass "index" instead of "start" to pte_offset_map_lock. It actually looks safe but this is more complex for nothing. It should simply do: orig_pte = pte_offset_map_lock(vma->vm_mm, pmd, index, &ptl); pte = *orig_pte; pte_unmap_unlock(orig_pte, ptl); Is the bug reproducible? If yes the simplest is probably to add some allocation tracking to the page, so if page->ptl is null we can simply print a stack trace of who allocated the page (and later forgot to initialize the ptl). /* Reset page->mapping so free_pages_check won't complain. */ static inline void pte_lock_deinit(struct page *page) { page->mapping = NULL; ptlock_free(page); } btw, page->mapping = NULL should be removed, that most certainly comes from older kernels when page->mapping was in the same union with page->ptl. page->mapping of pagetables should stay zero at all times. Agree with Kirill that it would help to verify the bug goes away by disabling USE_SPLIT_PTE_PTLOCKS. -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@xxxxxxxxx. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@xxxxxxxxx"> email@xxxxxxxxx </a>