On Tue, Jun 01, 2021 at 10:18:36AM -0400, Boris Ostrovsky wrote:
> On 5/28/21 5:50 PM, Anchal Agarwal wrote:
>
> > That only fails during boot but not after the control jumps into the image. The
> > non boot cpus are brought offline(freeze_secondary_cpus) and then online via
> > cpu hotplug path. In that case xen_vcpu_setup doesn't invokes the hypercall again.
>
> OK, that makes sense --- by that time VCPUs have already been registered.
> What I don't understand though is why resume doesn't fail every time ---
> xen_vcpu and xen_vcpu_info should be different practically always,
> shouldn't they? Do you observe successful resumes when the hypercall fails?
>
The resume won't fail because in the image xen_vcpu and xen_vcpu_info are
the same. These are the same values that were in place when the hibernation
image was saved. So whatever value xen_vcpu got during boot-time
registration on resume is essentially lost once the jump into the saved
kernel image happens. The interesting part is that if KASLR is not enabled,
the boot-time vcpup mfn is the same as in the image. Once you enable KASLR,
this value sometimes changes, and whenever that happens resume gets stuck.
Does that make sense?

No, it does not resume successfully if the hypercall fails, because I was
trying to explicitly reset the vcpu and invoke the hypercall. I am just
wondering why the restore logic fails to work here, or probably I am
missing a critical piece.
>
> > Another line of thought is something what kexec does to come around this problem
> > is to abuse soft_reset and issue it during syscore_resume or may be before the
> > image get loaded.
> > I haven't experimented with that yet as I am assuming there has to be a way to
> > re-register vcpus during resume.
>
> Right, that sounds like it should work.
>
Do you mean the soft reset or re-registering the vcpus?

-Anchal
>
> -boris