Re: [PATCH] KVM: Improvements for task switching

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Bernhard Kohl wrote:
> Jan Kiszka <jan.kiszka <at> siemens.com> writes:
> 
>> Bernhard Kohl wrote:
>>> NSN's proprietary OS DMX sometimes does task switches.
>>> To get it running in KVM the following changes were necessary:
>>> Interrupt injection only with interrupt flag set.
>>> Linking the tss->prev_task_link to itself removed.
>>> Task linking is required for CALL and GATE.
>>> Do not call skip_emulated_instruction() for GATE.
>> Please post independent changes as separate patches. I guess the task
>> linking changes belong together, but surely not to the IRQ injection
>> patch. And the last change looks independent, too.
> 
> From my point of view it is one patch. The DMX OS crashed during its task
> switch. After fixing the first problem we got the 2nd, then the 3rd and 4th.
> It can only complete a complete task switch with all this fixed. Obviously
> all other guests don't do this kind of task switches.

Let's consider some hypothetic guest that gets unhappy about the 4th
change but would be fine with the other three - in order to find the
origin of the regression more quickly, one needs separate patches that
can be reverted and re-applied one-by-one. Look at this from a higher
POV, not just from your guest's perspective.

> 
>> Another wish (specifically as this is tricky stuff): also describe in
>> the commit log, why you changed something.
> 
> OK, I will do that.
> 
>> That causes concerns on my side as we had a hard time stabilizing this
>> code. Need to think about it. Do you happen to have a test case for this
>> (if it's not publicly shareable, contact me directly)? Did you check
>> that this change causes no obvious regressions to other guests? What
>> about the user-inject IRQ case, does it already work for you as-is?
> 
> The test case is our DMX OS (no public availability). Without these changes it
> crashes the VM.

How did you debug the irq injection bug? Can you explain the scenario
which finally leads to your guest crash?

Normally, some to-be-injected IRQ is marked pending first when the IRQ
window is open and it is then immediately injected. That may fail, the
failure resolution is started, and then the still pending IRQ is
re-injected. I'm interested in that failure, and why the IRQ window
state changed after fixing up. Maybe it is a specific property of your
OS. See, I'm a fan of understanding what went wrong before patching it. :)

> No known other problems. Linux guests run well with these
> changes. Others not tested.

Meanwhile I also think that this particular change should not cause
regressions.

> 
>> What about 16-bit switches, are they already correct?
> 
> Maybe similar changes are needed for 16-bit switches. DMX does not do that.
> So I have no guest to test this.

At least you could try to apply your findings in an analogous way to the
16-bit case. Note in the change log that there is no test case yet and
let us wait for someone else to come around and stress it (which
probably means that we had no user for that use case so far anyway).

Jan

-- 
Siemens AG, Corporate Technology, CT SE 2
Corporate Competence Center Embedded Linux
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[Index of Archives]     [KVM ARM]     [KVM ia64]     [KVM ppc]     [Virtualization Tools]     [Spice Development]     [Libvirt]     [Libvirt Users]     [Linux USB Devel]     [Linux Audio Users]     [Yosemite Questions]     [Linux Kernel]     [Linux SCSI]     [XFree86]
  Powered by Linux