On 14.04.14 10:08, liu ping fan wrote:
On Mon, Apr 14, 2014 at 2:43 PM, Alexander Graf <agraf@xxxxxxx> wrote:
On 13.04.14 04:27, Liu ping fan wrote:
On Fri, Apr 11, 2014 at 10:03 PM, Alexander Graf <agraf@xxxxxxx> wrote:
On 11.04.2014, at 13:45, Liu Ping Fan <pingfank@xxxxxxxxxxxxxxxxxx> wrote:
When we mark a pte with _PAGE_NUMA, we already call
mmu_notifier_invalidate_range_start and mmu_notifier_invalidate_range_end,
which mark the existing guest hpte entry as HPTE_V_ABSENT. Now we need to do
that when we are inserting new guest hpte entries.
What happens when we don't? Why do we need the check? Why isn't it done
implicitly? What happens when we treat a NUMA marked page as non-present?
Why does it work out for us?
Assume you have no idea what PAGE_NUMA is, but try to figure out what
this patch does and whether you need to cherry-pick it into your downstream
kernel. The description as it stands is still not very helpful for that. It
doesn't even explain what really changes with this patch applied.
Yeah. What about appending the following description? Does it make
the context clear?
"The guest should not set up an hpte for a page whose pte is marked with
_PAGE_NUMA, so that on the host the numa-fault mechanism can take effect
and check whether the page is placed correctly."
Try to come up with a text that answers the following questions in order:
I divided them into 3 groups and answer them in 3 sections. Together they
seem to tell the whole story :)
Please take a look.
- What does _PAGE_NUMA mean?
Group 1 -> section 2
- How does page migration with _PAGE_NUMA work?
-> Why should we not map pages when _PAGE_NUMA is set?
Group 2 -> section 1
(Note: for the 1st question in this group, I am not sure about the
details, except that numa balancing can be fixed by moving the task or
moving the page. So my comment is "migration should be involved to cut
down the distance between the cpu and pages".)
- Which part of what needs to be done did the previous _PAGE_NUMA patch
address?
- What's the situation without this patch?
- Which scenario does this patch fix?
Group 3 -> section 3
Numa fault is a method which helps to achieve auto numa balancing.
When such a page fault takes place, the page fault handler checks
whether the page is placed correctly. If not, migration should be
involved to cut down the distance between the cpu and pages.
A pte with _PAGE_NUMA helps to implement numa fault. It means the MMU is
not allowed to access the page directly, so a page fault is triggered
and the numa fault handler gets the opportunity to run the checker.
As for MMU access, we need special handling for the powernv guest.
When we mark a pte with _PAGE_NUMA, we already call the mmu_notifier to
invalidate it in the guest's htab, but when we try to re-insert it,
we first try to fix it in real mode. Only after this fails do we fall back
to virt mode and, most importantly, we run the numa fault handler in virt
mode. This patch guards the real-mode path to ensure that if a pte is
marked with _PAGE_NUMA, it will NOT be fixed in real mode; instead, it will
be fixed in virt mode and have the opportunity to have its placement checked.
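
To illustrate the third section above, here is a minimal user-space sketch of
the guard it describes. It is not the actual patch: the flag values, the enum,
and the names pte_present, pte_numa and real_mode_try_map are all hypothetical
stand-ins chosen for this example, and the real kernel definitions and code
paths differ.

```c
#include <stdbool.h>

/* Hypothetical flag bits for illustration only; not the kernel's values. */
#define PTE_PRESENT 0x1u
#define PTE_NUMA    0x2u

/* Possible outcomes of the (modelled) real-mode attempt to insert an hpte. */
enum map_result {
    MAPPED_REAL_MODE,   /* hpte inserted directly in real mode */
    DEFER_TO_VIRT_MODE  /* give up here so the virt-mode path handles it */
};

static bool pte_present(unsigned int pte)
{
    return (pte & PTE_PRESENT) != 0;
}

static bool pte_numa(unsigned int pte)
{
    return (pte & PTE_NUMA) != 0;
}

/*
 * Model of the real-mode path: a pte marked with the NUMA bit is treated
 * like a non-present one, so the request falls back to virt mode, where
 * the numa fault handler can check the page placement.
 */
static enum map_result real_mode_try_map(unsigned int pte)
{
    if (!pte_present(pte) || pte_numa(pte))
        return DEFER_TO_VIRT_MODE;
    return MAPPED_REAL_MODE;
}
```

In this model, real_mode_try_map(PTE_PRESENT) returns MAPPED_REAL_MODE, while
real_mode_try_map(PTE_PRESENT | PTE_NUMA) returns DEFER_TO_VIRT_MODE, which is
the behaviour the description attributes to the patch.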
s/fixed/mapped/g
Otherwise works as patch description for me :).
Alex