Re: [patch 019/154] mm: make madvise(MADV_WILLNEED) support swap file prefetch

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Sun, Dec 15, 2013 at 2:03 PM, Sasha Levin <sasha.levin@xxxxxxxxxx> wrote:
> On 12/15/2013 02:16 PM, Linus Torvalds wrote:
>>
>> Can anybody see what's wrong with that code? It all seems to happen
>> with mmap_sem held for reading, so there is no mmap activity going on,
>> but what about concurrent pmd splitting due to page faults etc?
>
> There's one thing that seems odd to me: the only place that allocated
> the ptl is in pgtable_page_ctor() called from pte_alloc_one().
>
> However, I don't see how ptl is allocated through all the
> mk_pmd()/mk_huge_pmd()
> calls in mm/huge_memory.c .
>
> I've added some debug output, and it seems that indeed the results of
> mk_pmd() are
> with ptl == NULL and one of them ends up getting to swapin_walk_pmd_entry
> where it NULL
> ptr derefs.

Hmm. I don't see that in my tree either, so that doesn't seem to be a
linux-next issue.

How are we not hitting this left and right? Sure, you need spinlock
debugging or something like that to trigger the BLOATED_SPINLOCKS
case, and you'd need the USE_SPLIT_PTE_PTLOCKS case to have this at
all, but that shouldn't be *that* unusual. And afaik, we should hit
this on just about any page table traversal.

So I *think* the rule is that largepages don't have ptl entries (since
they don't have page tables associated with them), and they need to be
handled specially.

But it's also possibly just that maybe nothing really uses
large-pages. And afaik, we used to disable USE_SPLIT_PTE_PTLOCKS
entirely with big spinlocks until Kirill added that indirection
pointer, so that would explain why we just never noticed this issue
before (although I'd have expected that the spinlock still needs to be
initialized, even if it doesn't need allocating - otherwise we'd
possibly just hang on a "spin_lock()" that never succeeds).

Adding Kirill to the participants, since he did the
pgtable_pmd_page_ctor/dtor stuff and enabled split PTE locks even with
BLOATED_SPINLOCKS. And Andrea, since largepages are involved. And
linux-mm just to have *some* list cc'd.

Kirill? Sasha seems to trigger this problem with
madvise(MADV_WILLNEED), possibly on a hugepage mapping (but see
below..) The

        orig_pte = pte_offset_map_lock(vma->vm_mm, pmd, start, &ptl);

in swapin_walk_pmd_entry() ends up taking a NULL ptr fault because the
pmd doesn't have a ptl pointer..

But why would we trigger this bug then, since we have:

        if (pmd_none_or_trans_huge_or_clear_bad(pmd))
                return 0;

in swapin_walk_pmd_entry(). Possibly racing with a page-in? Should we
check the "vma->vm_flags" for VM_HUGETLB?

Let's hope the new people have more answers than questions ;)

                  Linus

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@xxxxxxxxx.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@xxxxxxxxx";> email@xxxxxxxxx </a>




[Index of Archives]     [Linux ARM Kernel]     [Linux ARM]     [Linux Omap]     [Fedora ARM]     [IETF Annouce]     [Bugtraq]     [Linux]     [Linux OMAP]     [Linux MIPS]     [ECOS]     [Asterisk Internet PBX]     [Linux API]