Nadav Amit <nadav.amit@xxxxxxxxx> writes:

> Avi Kivity <avi.kivity@xxxxxxxxx> wrote:
>
>> On 04/09/2015 09:21 PM, Nadav Amit wrote:
>>> Bandan Das <bsd@xxxxxxxxxx> wrote:
>>>
>>>> Nadav Amit <nadav.amit@xxxxxxxxx> writes:
>>>>
>>>>> Jan Kiszka <jan.kiszka@xxxxxxxxxxx> wrote:
>>>>>
>>>>>> On 2015-04-08 19:40, Nadav Amit wrote:
>>>>>>> Jan Kiszka <jan.kiszka@xxxxxxxxxxx> wrote:
>>>>>>>
>>>>>>>> On 2015-04-08 18:59, Nadav Amit wrote:
>>>>>>>>> Jan Kiszka <jan.kiszka@xxxxxxxxxxx> wrote:
>>>>>>>>>
>>>>>>>>>> On 2015-04-08 18:40, Nadav Amit wrote:
>>>>>>>>>>> Hi,
>>>>>>>>>>>
>>>>>>>>>>> I would appreciate if someone explains the reason for enabling LINT0 during
>>>>>>>>>>> APIC reset. This does not correspond with Intel SDM Figure 10-8: “Local
>>>>>>>>>>> Vector Table” that says all LVT registers are reset to 0x10000.
>>>>>>>>>>>
>>>>>>>>>>> In kvm_lapic_reset, I see:
>>>>>>>>>>>
>>>>>>>>>>>     apic_set_reg(apic, APIC_LVT0,
>>>>>>>>>>>                  SET_APIC_DELIVERY_MODE(0, APIC_MODE_EXTINT));
>>>>>>>>>>>
>>>>>>>>>>> Which is actually pretty similar to QEMU’s apic_reset_common:
>>>>>>>>>>>
>>>>>>>>>>>     if (bsp) {
>>>>>>>>>>>         /*
>>>>>>>>>>>          * LINT0 delivery mode on CPU #0 is set to ExtInt at initialization
>>>>>>>>>>>          * time typically by BIOS, so PIC interrupt can be delivered to the
>>>>>>>>>>>          * processor when local APIC is enabled.
>>>>>>>>>>>          */
>>>>>>>>>>>         s->lvt[APIC_LVT_LINT0] = 0x700;
>>>>>>>>>>>     }
>>>>>>>>>>>
>>>>>>>>>>> Yet, in both cases, I miss the point - if it is typically done by the BIOS,
>>>>>>>>>>> why does QEMU or KVM enable it?
>>>>>>>>>>>
>>>>>>>>>>> BTW: KVM seems to run fine without it, and I think setting it causes me
>>>>>>>>>>> problems in certain cases.
>>>>>>>>>> I suspect it has some historic BIOS backgrounds. Already tried to find
>>>>>>>>>> more information in the git logs of both code bases? Or something that
>>>>>>>>>> indicates of SeaBIOS or BochsBIOS once didn't do this initialization?
>>>>>>>>> Thanks. I found no indication of such thing.
>>>>>>>>>
>>>>>>>>> QEMU’s commit message (0e21e12bb311c4c1095d0269dc2ef81196ccb60a) says:
>>>>>>>>>
>>>>>>>>> Don't route PIC interrupts through the local APIC if the local APIC
>>>>>>>>> config says so. By Ari Kivity.
>>>>>>>>>
>>>>>>>>> Maybe Avi Kivity knows this guy.
>>>>>>>> ths? That should have been Thiemo Seufer (IIRC), but he just committed
>>>>>>>> the code back then (and is no longer with us, sadly).
>>>>>>> Oh… I am sorry - I didn’t know about that.. (I tried to make an unfunny joke
>>>>>>> about Avi knowing “Ari”).
>>>>>> Ah. No problem. My brain apparently fixed that typo up unnoticed.
>>>>>>
>>>>>>>> But if that commit went in without any BIOS changes around it, QEMU
>>>>>>>> simply had to do the job of the latter to keep things working.
>>>>>>> So should I leave it as is? Can I at least disable in KVM during INIT (and
>>>>>>> leave it as is for RESET)?
>>>>>> No, I don't think there is a need to leave this inaccurate for QEMU if
>>>>>> our included BIOS gets it right. I don't know what the backward
>>>>>> bug-compatibility of KVM is, though. Maybe you can identify since when
>>>>>> our BIOS is fine so that we can discuss time frames.
>>>>> I think that it was addressed in commit
>>>>> 19c1a7692bf65fc40e56f93ad00cc3eefaad22a4 ("Initialize the LINT LVTs on the
>>>>> local APIC of the BSP.") So it should be included in seabios 0.5.0, which
>>>>> means qemu 0.12 - so we are talking about the end of 2009 or start of 2010.
>>>> The probability that someone will use a newer version of kernel with something
>>>> as old as 0.12 is probably minimal. I think it's ok to change it with a comment
>>>> indicating the reason. To be on the safe side, however, a user changeable switch
>>>> is something worth considering.
>>> I don’t see any existing mechanism for KVM to be aware of its user type and
>>> version.
>>> I do see another case of KVM hacks that are intended for fixing
>>> very old QEMU bugs (see 3a624e29c75 changes in vmx_set_segment, which are
>>> from pretty much the same time-frame of the issue I try to fix).
>>>
>>> Since this is something which would follow around, please advise what would
>>> be the format. A new ioctl that would supply the userspace “type” (according
>>> to predefined constants) and version?
>>
>> That would be madness. KVM shouldn't even know that qemu exists, let alone
>> track its versions.
>>
>> Simply add a new toggle KVM_USE_STANDARD_LAPIC_LVT_INIT and document that
>> userspace MUST use it. Old userspace won't, and will get the old buggy
>> behavior.
>
> I fully agree it would be madness. Yet it appears to be a recurring problem.
> Here are similar problems found from a short search:
>
> 1. vmx_set_segment setting segment accessed (3a624e29c75)
> 2. svm_set_cr0 clearing CD and NW (709ddebf81c)
> 3. Limited number of MTRRs due to SeaBIOS bug (0d234daf7e0a)
>
> Excluding (1), all of the other issues are related to the VM BIOS. Perhaps
> KVM should somehow realize which VM BIOS runs? (yes, it sounds just as bad.)

How about renaming the toggle Avi mentioned above to something more generic
(KVM_DISABLE_LEGACY_QUIRKS?) and grouping all the issues together? Modern
userspace will always enable it and get the new correct behavior. When more
cases are discovered, KVM can just add them to the list.

> Nadav
>
> --
> To unsubscribe from this list: send the line "unsubscribe kvm" in
> the body of a message to majordomo@xxxxxxxxxxxxxxx
> More majordomo info at http://vger.kernel.org/majordomo-info.html
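For illustration, the grouped toggle suggested above could look roughly like
the sketch below. Every name in it (the demo_ prefix, the quirk bits, the
struct) is hypothetical and is not an existing KVM interface. The only values
taken from this thread are the 0x10000 reset value from the SDM and the 0x700
ExtInt setting from QEMU's apic_reset_common. The assumption is that userspace
sets one bit per legacy behavior it wants to turn off, so old userspace that
sets nothing keeps the old behavior.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical quirk bits, one per legacy behavior discussed in the thread. */
#define DEMO_QUIRK_LINT0_EXTINT   (1u << 0)  /* LINT0 enabled as ExtInt on APIC reset */
#define DEMO_QUIRK_CD_NW_CLEARED  (1u << 1)  /* svm_set_cr0 clearing CD and NW */

/* LVT values quoted in the thread. */
#define DEMO_LVT_MASKED  0x10000u  /* SDM reset value: only the mask bit (bit 16) set */
#define DEMO_LVT_EXTINT  0x700u    /* delivery mode ExtInt (bits 10:8), unmasked */

/* Hypothetical per-VM state: bits set here are quirks userspace disabled. */
struct demo_vm {
	uint32_t disabled_quirks;
};

/* A quirk stays active unless userspace explicitly disabled it. */
static bool demo_quirk_enabled(const struct demo_vm *vm, uint32_t quirk)
{
	return (vm->disabled_quirks & quirk) == 0;
}

/* Pick the LINT0 LVT value to program on APIC reset. */
static uint32_t demo_lint0_reset_value(const struct demo_vm *vm)
{
	if (demo_quirk_enabled(vm, DEMO_QUIRK_LINT0_EXTINT))
		return DEMO_LVT_EXTINT;   /* legacy behavior for old userspace */
	return DEMO_LVT_MASKED;           /* behavior matching the SDM */
}
```

With this layout, adding another case later only means defining one more bit
and checking it at the place where the legacy behavior lives.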