From: Naman Jain <namjain@xxxxxxxxxxxxxxxxxxx> Sent: Wednesday, November 13, 2024 12:47 AM > > On 11/12/2024 8:43 AM, Michael Kelley wrote: > > From: Naman Jain <namjain@xxxxxxxxxxxxxxxxxxx> Sent: Sunday, November 10, 2024 9:44 PM > >> > >> On 11/7/2024 11:14 AM, Naman Jain wrote: > >>> > >>> On 11/1/2024 12:44 AM, Michael Kelley wrote: > >>>> From: Naman Jain <namjain@xxxxxxxxxxxxxxxxxxx> Sent: Tuesday, October 29, 2024 1:02 AM > >>>>> > > > > [snip] > > > >>>>> @@ -2494,6 +2495,22 @@ static int vmbus_bus_resume(struct device *dev) > >>>>> > >>>>> vmbus_request_offers(); > >>>>> > >>>>> + mutex_lock(&vmbus_connection.channel_mutex); > >>>>> + list_for_each_entry(channel, &vmbus_connection.chn_list, listentry) { > >>>>> + if (channel->offermsg.child_relid != INVALID_RELID) > >>>>> + continue; > >>>>> + > >>>>> + /* hvsock channels are not expected to be present. */ > >>>>> + if (is_hvsock_channel(channel)) > >>>>> + continue; > >>>>> + > >>>>> + pr_err("channel %pUl/%pUl not present after resume.\n", > >>>>> + &channel->offermsg.offer.if_type, > >>>>> + &channel->offermsg.offer.if_instance); > >>>>> + /* ToDo: Cleanup these channels here */ > >>>>> + } > >>>>> + mutex_unlock(&vmbus_connection.channel_mutex); > >>>>> + > >>>> > >>>> Dexuan and John have explained how in Azure VMs, there should not be > >>>> any VFs assigned to the VM at the time of hibernation. So the above > >>>> check for missing offers does not trigger an error message due to > >>>> VF offers coming after the all-offers-received message. > >>>> > >>>> But what about the case of a VM running on a local Hyper-V? I'm not > >>>> completely clear, but in that case I don't think any VFs are removed > >>>> before the hibernation, especially for VM-initiated hibernation. It's > >>> > >>> I am not sure about this behavior. I have requested Dexuan offline > >>> for a comment. > >>> > >>>> a reasonable scenario to later resume that same VM, with the same > >>>> VF assigned to the VM. Because of the way current code counts > >>>> the offers, vmbus_bus_resume() waits for the VF to be offered again, > >>>> and all the channels get correct post-resume relids. But the changes > >>>> in this patch set break that scenario. Since vmbus_bus_resume() now > >>>> proceeds before the VF offer arrives, hv_pci_resume() calling > >>>> vmbus_open() could use the pre-hibernation relid for the VF and break > >>>> things. Certainly the "not present after resume" error message would > >>>> be spurious. > >>>> > >>>> Maybe the focus here is Azure, and it's tolerable for the local Hyper-V > >>>> case with a VF to not work pending later fixes. But I thought I'd call > >>>> out the potential issue (assuming my thinking is correct). > >>>> > >>>> Michael > >>> > >>> IIUC, below scenarios can happen based on your comment- > >>> > >>> Case 1: > >>> VF channel offer is received in time before hv_pci_resume() and there > >>> are no problems. > >>> > >>> Case 2: > >>> Resume proceeds just after getting ALLOFFERS_DELIVERED msg and a warning > >>> is printed that this VF channel is not present after resume. > >>> Then two scenarios can happen: > >>> Case 2.1: > >>> VF channel offer is received before hv_pci_resume() and things work > >>> fine. Warning printed is spurious. > >>> Case 2.2: > >>> VM channel offer is not received before hv_pci_resume() and relid is > >>> not yet restored by onoffer. This is a problem. Warning is printed in > >>> this case for missing offer. > >>> > >>> I think it all depends on whether or not VFs are removed in local > >>> HyperV VMs. I'll try to get this information. Thanks for pointing this > >>> out. > >>> > >>> Regards, > >>> Naman > >>> > >> > >> Hi Michael, > >> I discussed with Dexuan and we tried these scenarios. Here are the > >> observations: > >> > >> For the two ways of host initiated hibernation: > >> > >> #1: Invoke-Hibernate $vm -Device (Uses the guest shutdown component) > >> OR > >> #2: Invoke-Hibernate $vm -ComputerSystem (Uses the RequestStateChange > >> ability) > > > > Question: What Powershell module provides "Invoke-Hibernate"? It's not > > present on my Windows 11 system that is running Hyper-V, and I can't > > find any documentation about it on the web. Or maybe Invoke-Hibernate > > is a Powershell *script*? > > Powertest is one of the internal packages, which has some commands to > trigger host-initiated hibernation. I should probably have mentioned > that in my original comment. > Makes sense. Since there was no mention of it on the web, I was reaching the conclusion that it must be something internal. There are public postings on how to twiddle power settings to do hibernation, so hopefully I can create my own equivalent. > > > >> > >> #1 does not remove the VF before sending the hibernate message to the VM > >> via hv_utils, but #2 does. > >> > >> With both #1 and #2, during resume, the host offers the vPCI vmbus > >> device before the vmbus_onoffers_delivered() is called. Whether or not > >> VFs are removed doesn't matter here, because during resume the first > >> fresh kernel always requests the VF device, meaning it has become a > >> boot-time device when the 'old' kernel is resuming back. So the issue we > >> are discussing will not happen in practice and the patch won't break > >> things and won't print spurious warnings. If its OK, please let me know, > >> I'll then proceed with v3. > >> > > > > Ah, this is interesting. I'm assuming these are the details: > > > > 1) VM boots with the intent of resuming from hibernation (though > > Hyper-V doesn't know about that intent) > > 2) Original fresh kernel is loaded and begins initialization > > 3) VMBus offers come in for boot-time devices, which excludes SR-IOV VFs. > > 4) ALLOFFERS_DELIVERED message comes in > > 5) The storvsc driver initializes for the virtual disks on the VM > > 6) Kernel initialization code finds and reads the swap space to see if a > > hibernation image is present. If so, it reads in the hibernation image. > > 7) The suspend sequence is initiated (just like during hibernation) > > to shutdown the VMBus devices and terminate the VMBus connection. > > 8) Control is transferred to the previously read-in hibernation image > > 9) The hibernation image runs the resume sequence, which > > initiates a new VMBus connection and requests offers > > 10) VMBus offers come in for whatever VMBus devices were present > > when Step 7 initiated the suspend sequence. If a VF device was present > > at that time, an offer for that VF device will come in and will match up > > with the VF that was present in the VM at the time of hibernation. > > 11) ALLOFFERS_DELIVERED message comes in again for the > > newly initiated VMBus connection. > > > > 3), 4) works differently IMO. There is no request_for_offers, or > ALLOFFERS_DELIVERED for fresh kernel. Otherwise on adding the prints in > kernel, we should have seen these function calls *twice* in one > hibernation-resume cycle. But that is not the case. > > When the older/original kernel boots up, and requests offers, it gets > those VF offers again as part of boot time offers, and then > ALLOFFERS_DELIVERED msg comes. I'm still trying to figure out how fresh > kernel requests for VF offers or if it gets those offers automatically > from the host. I will update my findings so that it can be put up in > documentation which you mentioned. Hmmm. I'm not sure what might be happening. I'll be interested in what you find. I do indeed want to call out the details in my documentation. And I'll also try to repro myself. Michael > > > The netvsc driver gets initialized *after* step 4, but we don't know > > exactly *when* relative to the storvsc driver. The netvsc driver must > > tell Hyper-V that it can handle an SR-IOV VF, and the VF offer is sent > > sometime after that. While this netvsc/VF sequence is happening, the > > storvsc driver is reading the hibernation image from swap (Step 6). > > > > Maybe this is how fresh kernel gets the offers for VF devices. > > > I think the sequence you describe works when reading the > > hibernation image from swap takes 10's of seconds, or even several > > minutes in an Azure VM with a remote disk. That gives plenty > > of time for the VF to get initialized and be fully present when Step 7 > > starts. But there's no *guarantee* that the VF is initialized by then. > > It's also not clear to me what action by the guest causes Hyper-V to > > treat the VF as "added to the VM" so that in Step 10 the VF offer is > > sent before ALLOFFERS_DELIVERED. > > > > The sequence you describe also happens in an Azure VM, even if > > the VF is removed before hibernation. When the VF offer arrives > > during Step 10, it doesn't match with any VFs that were in the VM > > at the time of hibernation. It's treated as a new device, just like it > > would be if the offer arrived after ALLOFFERS_DELIVERED. > > > > But it seems like there's still the risk of having a fast swap disk > > and a small hibernation image that can be read in a shorter amount > > of time than it takes to initialize the VF to the point that Hyper-V > > treats it as added to the VM. Without knowing what that point is, > > it's hard to assess the likelihood of that happening. Or maybe there's > > an interlock I'm not aware of that ensures Step 7 can't proceed > > while the netvsc/VF sequence is in progress. > > > > So maybe it's best to proceed with this patch, and deal with the > > risk later when/if it becomes reality. I'm OK if you want to do > > that. This has been an interesting discussion that I'll try to capture > > in some high-level documentation about how Linux guests on > > Hyper-V do hibernation! > > > > Michael > > > > I have sent v3 with the changes we discussed. > > Regards, > Naman