On 01/29/2010 09:25 AM, Jan Kiszka wrote:
Anthony Liguori wrote:
On 01/27/2010 08:54 AM, Jan Kiszka wrote:
This patch originates in the mp_state writeback issue: During runtime
and even on reset, we must not write the previously saved VCPU state
back into the kernel in an uncontrolled fashion. E.g mp_state should
only written on reset or on VCPU setup. Certain clocks (e.g. the TSC)
may only be written on setup or after vmload.
By introducing additional information about the context of the planned
vcpu state manipulation, we can simply skip sensitive states like
mp_state when updating the in-kernel state. The planned modifications
are defined when calling cpu_synchronize_state. They accumulate, ie.
once a full writeback was requested, it will stick until it was
performed.
This patch already fixes existing writeback issues in upstream KVM by
only selectively writing MSR_IA32_TSC, MSR_KVM_SYSTEM_TIME,
MSR_KVM_WALL_CLOCK, the mp_state and the vcpu_events.
Signed-off-by: Jan Kiszka<jan.kiszka@xxxxxxxxxxx>
I think the context argument makes the function very difficult to call
correctly.
I'd suggest making CPU_MODIFY_RUNTIME the behaviour of
cpu_synchronize_state. I'd suggest adding another function to handle
init like cpu_init_state(). Likewise, if an explicit reset state is
needed, I think a cpu_init_state_after_reset() makes sense.
I don't quite understand the context that NONE should be used in.
'context' was the wrong term, it should rather be 'scheduled vcpu state
modifications'.
I think the key point though is to handle RUNTIME mostly transparently
since it's the most common case.
This whole topic is complex and requires at least some cooperation from
the users of this API. Previous attempts to make this transparent caused
way too many bugs. E.g. the idea that writeback could simply be handled
on vcpu exec didn't fly, and qemu-kvm demonstrates the result (lots of
kvm hooks, fragile code).
So I'm about to carefully remove some transparency. The key to this is
proper announcement of planned and/or performed changes (abstracted to
those three levels "runtime", "reset", and "init").
I will think about your suggestions. Maybe it makes sense to
(re-)introduce explicit writeback points as generic services, and we
should keep the common case as is (dropping my optimization
CPU_MODIFY_NONE).
I can understand the argument you're making but if you do decide to keep
the context argument, then I'd strongly suggest adding a bucket full of
documentation explaining when each state should be used. At the moment,
it's not clear at first glance.
Regards,
Anthony Liguori
Jan
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html