Kohl, Bernhard (NSN - DE/Munich) wrote: > Jan Kiszka Wrote: >> Bernhard Kohl wrote: >>> Jan Kiszka <jan.kiszka <at> siemens.com> writes: >>> >>>> Bernhard Kohl wrote: >>>>> NSN's proprietary OS DMX sometimes does task switches. >>>>> To get it running in KVM the following changes were necessary: >>>>> Interrupt injection only with interrupt flag set. >>>>> Linking the tss->prev_task_link to itself removed. >>>>> Task linking is required for CALL and GATE. >>>>> Do not call skip_emulated_instruction() for GATE. >>>> Please post independent changes as separate patches. I >> guess the task >>>> linking changes belong together, but surely not to the IRQ >> injection >>>> patch. And the last change looks independent, too. >>> From my point of view it is one patch. The DMX OS crashed >> during its task >>> switch. After fixing the first problem we got the 2nd, then >> the 3rd and 4th. >>> It can only complete a complete task switch with all this >> fixed. Obviously >>> all other guests don't do this kind of task switches. >> Let's consider some hypothetic guest that gets unhappy about the 4th >> change but would be fine with the other three - in order to find the >> origin of the regression more quickly, one needs separate patches that >> can be reverted and re-applied one-by-one. Look at this from a higher >> POV, not just from your guest's perspective. > > OK, after the discussion has finished, I will submit separate patches. > >>>> Another wish (specifically as this is tricky stuff): also >> describe in >>>> the commit log, why you changed something. >>> OK, I will do that. >>> >>>> That causes concerns on my side as we had a hard time >> stabilizing this >>>> code. Need to think about it. Do you happen to have a test >> case for this >>>> (if it's not publicly shareable, contact me directly)? Did >> you check >>>> that this change causes no obvious regressions to other >> guests? What >>>> about the user-inject IRQ case, does it already work for you as-is? >>> The test case is our DMX OS (no public availability). >> Without these changes it >>> crashes the VM. >> How did you debug the irq injection bug? Can you explain the scenario >> which finally leads to your guest crash? > > Actually my colleague Thomas did the debugging. Thomas, please describe > the details! > >> Normally, some to-be-injected IRQ is marked pending first when the IRQ >> window is open and it is then immediately injected. That may fail, the >> failure resolution is started, and then the still pending IRQ is >> re-injected. I'm interested in that failure, and why the IRQ window >> state changed after fixing up. Maybe it is a specific property of your >> OS. See, I'm a fan of understanding what went wrong before >> patching it. :) >> >>> No known other problems. Linux guests run well with these >>> changes. Others not tested. >> Meanwhile I also think that this particular change should not cause >> regressions. >> >>>> What about 16-bit switches, are they already correct? >>> Maybe similar changes are needed for 16-bit switches. DMX >> does not do that. >>> So I have no guest to test this. >> At least you could try to apply your findings in an analogous >> way to the >> 16-bit case. Note in the change log that there is no test case yet and >> let us wait for someone else to come around and stress it (which >> probably means that we had no user for that use case so far anyway). > > Thomas, can you do that? I'm on vacation next week. After that I will > post the final result as a new patch set. > Great, thanks in advance! Jan -- Siemens AG, Corporate Technology, CT SE 2 Corporate Competence Center Embedded Linux -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html