VMI Interface Proposal Documentation for I386, Part 4

zach at vmware.com (Zachary Amsden) · Wed Mar 15 23:01:51 2006

Pavel Machek wrote:
> Hi!
>
>   
>>     6) Interrupts must always be enabled when running code in userspace.
>>     
>
> I'd say this breaks userspace.
>   

I agree.  My claim is that this is not an issue in a virtual machine.  
What possible reason can you have to disable interrupts in userspace?  
Well, several.  For one, the X server wants to disable interrupts 
temporarily during probing of dot clocks to get accurate timings, and 
also to avoid the kernel interrupting during a sensitive VGA register 
access.  Several other userspace programs, including CMOS time sync 
utilities do this as well.  I contend this is broken, even on native 
hardware, for two reasons.

1) The sensitive VGA register access argument is bogus.  There is 
already a kernel interface that is used by X11 to take control of video 
which lets the kernel know explicitly not to touch the VGA registers.  
The oddity is due to the fact that there are many write only registers, 
and thus, you can't track state of these without explicit handoff.  The 
same interface can be used to avoid these sensitive accesses.

2) Timing dot clocks by disabling interrupts is still broken and subject 
to random variance.  Chipsets which support system management modes can 
cause the processor to enter SMM mode at any time, even when interrupts 
are disabled and NMIs are masked.  This is deliberately hidden from the 
running code, but it does cause time to elapse, which is visible via the 
TSC and all hardware time counters.  Therefore, you can never get an 
accurate timing in one iteration, and using multiple iterations allows 
you to effectively deal with the same issues you would have if you left 
interrupts enabled.

> This code used to work when ran as root:
>
> void
> main(void)
> {
>         int i;
>         iopl(3);
>         while (1) {
>                 asm volatile("cli");
>                 //              for (i=0; i<20000000; i++)
>                 for (i=0; i<1000000000; i++)
>                         asm volatile("");
>                 asm volatile("sti");
>                 sleep(1);
>         }
> }
>
> ...and was actually useful.
>   

The code you show above can be made to work in a virtual machine, and 
you can allow userspace to disable interrupts and still have a perfectly 
fine solution -- if you restrict the enabling and disabling of 
interrupts in userspace to the cli and sti instructions.  But it does 
not work if you start using nested interrupt control, using pushf and popf.

The virtual machine monitor must always leave hardware interrupts 
enabled, since it must service them without allowing the guest VM to 
interfere.  As such, the actual state of the hardware interrupt flag is 
visible to userspace programs.  CLI and STI get away with this, because 
they are privileged instructions, and as such, they trap when IOPL is 
not present.  But PUSHF and POPF do not.  A POPF instruction which 
changes the interrupt flag behaves differently, depending on the IOPL 
state.  When IOPL is not present, and the POPF would change the state of 
the interrupt flag - nothing happens.  The interrupt flag is not 
changed, but most importantly, it is not a privileged instruction, so it 
does not trap.

Therefore, this instruction is non-virtualizable.  You can not run it 
directly in a virtual machine - you must simulate it.  To simulate it 
requires either straightforward interpretation, hardware virtualization, 
or binary translation.  Therein lies the crux of the problem.  While you 
can allow userspace to enable and disable interrupts using CLI and STI, 
you have no way to simulate its use of the POPF instruction unless you 
use one of these technologies.  This is why we disallow all toggling of 
the interrupt flag from userspace, since one of the design goals of 
paravirtualization is not to change userspace code.

Combined with the above argument that enabling / disabling is really not 
useful for userspace in a virtual machine, we have found that if you 
just completely disallow IOPL'ed userspace to enable and disable 
interrupts, _but_ never issue faults to it if it tries, everything just 
works.  The alternative allows you to get in a state where you can end 
up in a non-virtualizable userspace scenario, which is highly undesirable.

>   
>>     7) IOPL semantics for userspace are changed; although userspace may be
>>        granted port access, it can not affect the interrupt flag.
>>     
See above for the impact on X.  X11 runs perfectly fine in our 
paravirtual VMM.

Nit:  Dropping cc'd persons is probably not a good thing.  Some of the 
people here don't subscribe to LKML in full, and would still like to be 
copied on these messages.  No offense meant or taken.

Zach