On 2016-02-08 18:38, Bruce Rogers wrote:
>>>> On 2/8/2016 at 10:27 AM, Bruce Rogers wrote:
>>>>> On 2/8/2016 at 09:40 AM, Paolo Bonzini <pbonzini@xxxxxxxxxx> wrote:
>>
>>>
>>> On 08/02/2016 17:33, Bruce Rogers wrote:
>>>>>>>>
>>>>>>>> KVM_MP_STATE_INIT_RECEIVED is what Intel calls the "wait for SIPI"
>>>>>>>> state. The BSP never gets a SIPI, it goes straight to 0xFFFFFFF0
>>>>>>>> instead. Can you explain the problem more in detail?
>>>>>>
>>>>>> I suspect this is about sending INIT-SIPI from another CPU, directed to
>>>>>> the BSP, isn't it? We may have to differentiate between CPU (including
>>>>>> system) reset and that IPI case.
>>>> That is correct. In looking over the KVM code which deals with BSP, this
>>>> was the only place which seemed wrong to me wrt special casing for BSP
>>>> outside the context of initial system initialization / reset. As far as
>>>> I understand the BSP shouldn't be treated differently in this case.
>>>
>>> See 8.4.2 of the SDM:
>>>
>>> If the MP protocol has completed and a BSP is chosen, subsequent INITs
>>> (either to a specific processor or system wide) do not cause the MP
>>> protocol to be repeated. Instead, each logical processor examines its
>>> BSP flag (in the IA32_APIC_BASE MSR) to determine whether it should
>>> execute the BIOS boot-strap code (if it is the BSP) or enter a
>>> wait-for-SIPI state (if it is an AP).
>>>
>>> So it is correct to treat the BSP differently here, I think.
>>
>> I had read that, but I thought this was speaking from the perspective of
>> the SMP aware BIOS code only. In other words, the BIOS would sidetrack APs
>> (based on BSP flag not being present), while BSP would be allowed to go
>> through the regular BIOS code, checking for reset case, etc. An OS on the
>> other hand would be free to treat all x86 processors equally, once it has
>> booted into fully symmetrical mode.
>>
>> I certainly could be wrong about my above interpretation, but with these
>> changes I'm proposing, things work well for the test case of manually
>> onlining the BSP after the crash kernel has been started (via kexec -e on
>> an AP processor with maxcpus=1 on the crash kernel command line). From
>> looking through the kernel git history it appears this sequence of events
>> was explicitly supported quite a while ago, and we've got a customer who
>> uses this for fast recovery from a guest kernel crash.
>>
>> Bruce
>
> I mean kexec -p ... above, not kexec -e. Sorry about that.

How does real HW behave with your kexec case? Did you try this?

Jan
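
For reference, the rule in the SDM passage quoted above can be written down as a small stand-alone C model. This is only a sketch of that one reading, not KVM code, and it does not settle whether an INIT IPI sent to the BSP by a running OS should be handled the same way. The enum and the function name handle_init are made up for illustration; the only detail taken from the SDM is that the BSP flag is bit 8 of the IA32_APIC_BASE MSR.

```c
#include <stdint.h>

/* BSP flag: bit 8 of the IA32_APIC_BASE MSR, as described in the SDM. */
#define APIC_BASE_BSP_FLAG (UINT64_C(1) << 8)

/* Hypothetical outcome names for this sketch; not KVM's mp_state values. */
enum init_outcome {
    RUN_FROM_RESET_VECTOR, /* BSP: execute the BIOS boot-strap code */
    WAIT_FOR_SIPI          /* AP: stay parked until a startup IPI arrives */
};

/*
 * Model of "each logical processor examines its BSP flag" after an INIT,
 * once the MP protocol has already completed. The argument is the raw
 * value of the processor's IA32_APIC_BASE MSR.
 */
static enum init_outcome handle_init(uint64_t apic_base_msr)
{
    if (apic_base_msr & APIC_BASE_BSP_FLAG)
        return RUN_FROM_RESET_VECTOR;
    return WAIT_FOR_SIPI;
}
```

With the usual APIC base of 0xFEE00000 and the APIC enabled (bit 11 set), a value with bit 8 set takes the reset-vector path and a value with bit 8 clear ends up in wait-for-SIPI.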