Re: [RFC PATCH 00/73] KVM: x86/PVM: Introduce a new hypervisor

Lai Jiangshan <jiangshanlai@xxxxxxxxx> · Thu, 29 Feb 2024 22:55:28 +0800

Hello, Paolo

On Mon, Feb 26, 2024 at 10:49 PM Paolo Bonzini <pbonzini@xxxxxxxxxx> wrote:
>
> On Mon, Feb 26, 2024 at 3:34 PM Lai Jiangshan <jiangshanlai@xxxxxxxxx> wrote:
> > - Full control: In XENPV/Lguest, the host Linux (dom0) entry code is
> >   subordinate to the hypervisor/switcher, and the host Linux kernel
> >   loses control over the entry code. This can cause inconvenience if
> >   there is a need to update something when there is a bug in the
> >   switcher or hardware.  Integral entry gives the control back to the
> >   host kernel.
> >
> > - Zero overhead incurred: The integrated entry code doesn't cause any
> >   overhead in host Linux entry path, thanks to the discreet design with
> >   PVM code in the switcher, where the PVM path is bypassed on host events.
> >   While in XENPV/Lguest, host events must be handled by the
> >   hypervisor/switcher before being processed.
>
> Lguest... Now that's a name I haven't heard in a long time. :)  To be
> honest, it's a bit weird to see yet another PV hypervisor. I think
> what really killed Xen PV was the impossibility to protect from
> various speculation side channel attacks, and I would like to
> understand how PVM fares here.

How does the host kernel protect itself from guest's speculation side
channel attacks?

PVM is primarily designed for secure containers like Kata containers,
where safety and security are of utmost importance.

Guests run in the hardware ring3 and they are treated as the same as the
normal user applications in the views of the host kernel's protections
and mitigations. The code employs all of the current protections and
mitigations for kernel/user interactions to host/guest and with extra
protections from pagetable isolation and with protections/mitigations
usually used for host/VTX_or_AMDV_guest (with some similar VM enter/exit
code as in vmx/ svm/). All of these are sorta easily achieved by the
"integral entry" design and "the distinct separation of the address
spaces" design can also help for protections.

How does the guest kernel protect itself from guest users' speculation
side channel attacks?

The code also tries its best to provide all of the current protections
and mitigations between the native kernel/user for virtualized kernel/user.
It is obvious that the PVM virtualized kernel operates in hardware ring3
and you can't expect all methods can be effective. Since the linux kernel
can provide protections for threads switching between different user
processes, PVM can potentially offer similar protections between guest
kernel/user through the PVM hypervisor's support.

I'm not familiar with XENPV's handling and its solutions (including its
impossibility) for the speculation side channel attacks, thus I cannot
provide additional insights or assurances in this context.

PVM is not designed as a general-purpose virtualization. The primary
objective is for secure container and Linux kernel testing. PVM intends
to allow for the universal deployment of Kata Containers inside cloud
VMs leased from any provider over the world.

For Kata containers, the protection between host/guest is much more
important and every container is only for a single tenement in which
the guest kernel is not a TCB of the external container services.
It means the protection requirements between guest kernel/user are
more flexible and customized.

>
> You obviously did a great job in implementing this within the KVM
> framework; the changes in arch/x86/ are impressively small.

Thanks for your appreciation.

> On the
> other hand this means it's also not really my call to decide whether
> this is suitable for merging upstream. The bulk of the changes are
> really in arch/x86/kernel/ and arch/x86/entry/, and those are well
> outside my maintenance area.
>

Thanks
Lai