Re: [PATCH] KVM: VMX: Reintroduce I/O port 0x80 bypass

Konrad Rzeszutek Wilk <konrad.wilk@xxxxxxxxxx> · Mon, 19 Mar 2018 12:53:48 -0400

On Mon, Mar 19, 2018 at 04:44:22PM +0000, Tim Shearer wrote:
> > From: Paolo Bonzini [mailto:pbonzini@xxxxxxxxxx]
> > Sent: Monday, March 19, 2018 11:59 AM
> > 
> > This obviously reintroduces the same issue noted in the second of these
> > commits: "If the guest floods this port with writes it generates
> > exceptions and instability in the host kernel, leading to a crash".  So
> > this patch is not acceptable.
> 
> Hi Paolo,
> 
> > 
> > What exactly is the use case where a VM is doing a lot of 0x80 accesses
> > at run-time?
> >
> 
> None specifically, but it is otherwise "normal" behavior of some VMs. Apparently it used to be a common method to synchronize writes to other I/O ports. In the thread for the original commit, no one was able to reproduce the problem. There are no details on which processors are affected or on what exactly the "exceptions and instability in the host kernel" were.

<blinks> There are OSes out there that use a "debug" port to synchronize
I/O port access? That seems ill-advised? What are those OSes?

> 
> In terms of performance, below is the mpstat output (from the host) for a CPU performing L2 packet forwarding on a pinned guest VCPU. Interrupts on that core are disabled. You can see it switching in and out of userspace constantly.

<blinks> There is a paravirt option to not use port 0x80 for the 'delay'
function. Is it possible that this OS is using port 0x80 for this? What
OS/kernel is this?

> 
>     $ mpstat -P 5 5
>     11:32:03     CPU    %usr   %nice    %sys %iowait    %irq   %soft  %steal  %guest  %gnice   %idle
>     Average:       5    6.91    0.00   26.98    0.00    0.00    0.00    0.00   65.78    0.00    0.33
>   
>   # sample trace output:
>        CPU 1/KVM-11359 [005] d...  2207.865724: kvm_entry: vcpu 1
>        CPU 1/KVM-11359 [005] ....  2207.865767: kvm_exit: reason IO_INSTRUCTION rip 0x483213 info c0700001 0
>        CPU 1/KVM-11359 [005] ....  2207.865768: kvm_pio: pio_write at 0xc070 size 2 count 1 val 0x0 
>        CPU 1/KVM-11359 [005] d...  2207.865769: kvm_entry: vcpu 1
>        CPU 1/KVM-11359 [005] ....  2207.865771: kvm_exit: reason IO_INSTRUCTION rip 0x483215 info 800040 0
>        CPU 1/KVM-11359 [005] ....  2207.865772: kvm_pio: pio_write at 0x80 size 1 count 1 val 0x0 
>        CPU 1/KVM-11359 [005] ....  2207.865773: kvm_fpu: unload
>        CPU 1/KVM-11359 [005] ....  2207.865774: kvm_userspace_exit: reason KVM_EXIT_IO (2)
> 
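
As an aside, the "info" values in the kvm_exit lines above can be decoded
by hand. The sketch below assumes the first info field is the VMX exit
qualification for an I/O instruction, laid out as in the Intel SDM (bits
2:0 = size - 1, bit 3 = direction, bit 4 = string, bit 5 = REP, bit 6 =
immediate operand, bits 31:16 = port). The function name is made up for
illustration and is not a KVM API.

```python
def decode_io_exit_qualification(qual: int) -> dict:
    """Decode a VMX I/O-instruction exit qualification (assumed SDM layout)."""
    return {
        "size": (qual & 0x7) + 1,                    # bits 2:0 hold size - 1
        "direction": "in" if qual & 0x8 else "out",  # bit 3: 1 = IN, 0 = OUT
        "string": bool(qual & 0x10),                 # bit 4: INS/OUTS
        "rep": bool(qual & 0x20),                    # bit 5: REP prefix
        "operand": "imm8" if qual & 0x40 else "dx",  # bit 6: port encoding
        "port": (qual >> 16) & 0xFFFF,               # bits 31:16: port number
    }


if __name__ == "__main__":
    # The two values taken from the trace quoted above.
    for raw in (0xC0700001, 0x00800040):
        f = decode_io_exit_qualification(raw)
        print(f"{raw:#010x}: {f['direction']} port {f['port']:#x} "
              f"size {f['size']} via {f['operand']}")
```

If that layout holds, the first exit is a 2-byte OUT to 0xc070 through DX
and the second is a 1-byte OUT to 0x80 with an immediate port, which
agrees with the kvm_pio lines in the same trace.
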
> According to perf, a single 0x80 write can take over 2ms (see Max Time below), and 0x80 writes account for ~82% of the total I/O exit time:
> 
>   $ perf kvm stat report --event=ioport
> 
>       IO Port Access    Samples  Samples%     Time%    Min Time    Max Time         Avg time 
>            0x80:POUT     316206    49.99%    81.97%      9.41us   2195.77us     10.41us ( +-   0.14% )
>          0xc070:POUT     158623    25.08%     9.18%      1.99us     29.27us      2.32us ( +-   0.05% )
>          0xc090:POUT     157583    24.91%     8.84%      1.97us     36.90us      2.25us ( +-   0.05% )
>            0x608:PIN        150     0.02%     0.02%      3.83us      6.71us      4.26us ( +-   1.01% )
>          0xc010:POUT          1     0.00%     0.00%     16.19us     16.19us     16.19us ( +-   0.00% )
> 
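
The Time% column above can be sanity-checked from the Samples and Avg
time columns. This is only a rough sketch: it assumes Time% is each
port's share of the summed (samples x average time), and it works from
the rounded averages in the report, so small differences are expected.
The variable and function names are illustrative.

```python
# (samples, average time in microseconds) copied from the perf report above.
PERF_ROWS = {
    "0x80:POUT": (316206, 10.41),
    "0xc070:POUT": (158623, 2.32),
    "0xc090:POUT": (157583, 2.25),
    "0x608:PIN": (150, 4.26),
    "0xc010:POUT": (1, 16.19),
}


def time_share(rows: dict) -> dict:
    """Return each port's share (in percent) of the total exit time."""
    totals = {port: n * avg for port, (n, avg) in rows.items()}
    grand_total = sum(totals.values())
    return {port: 100.0 * t / grand_total for port, t in totals.items()}


if __name__ == "__main__":
    for port, share in time_share(PERF_ROWS).items():
        print(f"{port:>12}: {share:6.2f}%")
```

Under that assumption the port 0x80 share comes out close to 82%, in line
with the 81.97% that perf reports.
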
> With the fix reverted, performance is restored (~95% guest-mode processing vs ~66%):
> 
>     $ mpstat -P 5 5
>     08:30:03 AM  CPU    %usr   %nice    %sys %iowait    %irq   %soft  %steal  %guest  %gnice   %idle
>     Average:       5    0.00    0.00    4.26    0.00    0.00    0.00    0.00   95.74    0.00    0.00
> 
> I understand that security/stability must take priority over performance. I would love to understand more about the original vulnerability though, because the performance cost is so high.

The original commit mentions it. It is a denial of service.

> 
> Many thanks,
> 
> Tim