On 5/14/19 9:07 AM, Peter Zijlstra wrote:
> On Mon, May 13, 2019 at 11:13:34AM -0700, Andy Lutomirski wrote:
>> On Mon, May 13, 2019 at 9:28 AM Alexandre Chartre
>> <alexandre.chartre@xxxxxxxxxx> wrote:
>>> Actually, I am not sure this is effectively useful because the IRQ
>>> handler is probably faulting before it tries to exit isolation, so
>>> the isolation exit will be done by the kvm page fault handler. I need
>>> to check that.
>>
>> The whole idea of having #PF exit with a different CR3 than was loaded
>> on entry seems questionable to me. I'd be a lot more comfortable with
>> the whole idea if a page fault due to accessing the wrong data was an
>> OOPS and the code instead just did the right thing directly.
>
> So I've run into this idea before; it basically allows a lazy approach
> to things.
>
> I'm somewhat conflicted on this; on the one hand, changing CR3 from
> #PF is a natural extension in that #PF already changes page-tables (for
> userspace / vmalloc etc.), on the other hand, there's a thin line
> between being lazy and being sloppy.
>
> If we're going down this route, I think we need a very coherent design
> and strong rules.

Right. We should make sure that the KVM page-table remains a subset of
the kernel page-table; in particular, page-table changes (e.g. for
vmalloc) should happen in the kernel page-table and not in the KVM
page-table.
So we should probably enforce switching to the kernel page-table when
doing operations like vmalloc. The current code doesn't enforce it, but
I can see it faulting when doing any allocation (because the KVM
page-table doesn't map all the structures used during an allocation).
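
A minimal userspace sketch of that rule may help make it concrete. It only models the idea and is not kernel code: the names `struct model`, `kvm_is_subset()`, `isolation_exit()` and `model_vmalloc()` are hypothetical and are not taken from the patch series, and each page-table is reduced to a bitmap of mapped ranges.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model: which page-table is currently loaded (stands in for CR3). */
enum active_pgt { PGT_KERNEL, PGT_KVM };

struct model {
	unsigned long kernel_map;  /* bit i set: range i mapped in the kernel page-table */
	unsigned long kvm_map;     /* bit i set: range i mapped in the KVM page-table */
	enum active_pgt active;
};

/* Invariant: every range mapped for KVM is also mapped in the kernel page-table. */
static bool kvm_is_subset(const struct model *m)
{
	return (m->kvm_map & ~m->kernel_map) == 0;
}

/* Stand-in for exiting isolation: go back to the kernel page-table. */
static void isolation_exit(struct model *m)
{
	m->active = PGT_KERNEL;
}

/*
 * Stand-in for a vmalloc-like operation that adds a mapping.
 * It switches to the kernel page-table explicitly instead of relying on
 * a fault, and it only ever modifies the kernel page-table.
 */
static void model_vmalloc(struct model *m, unsigned int range)
{
	assert(range < 8 * sizeof(unsigned long));

	if (m->active != PGT_KERNEL)
		isolation_exit(m);

	m->kernel_map |= 1UL << range;
	assert(kvm_is_subset(m));
}
```

In this model, an allocation made while the KVM page-table is active leaves `kvm_map` untouched and ends with the kernel page-table loaded, so the subset property holds by construction.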
alex.