From: Naman Jain <namjain@xxxxxxxxxxxxxxxxxxx> Sent: Tuesday, October 29, 2024 1:02 AM
>
> When resuming from hibernation, log any channels that were present
> before hibernation but now are gone.
> In general, the essential virtual devices configured for a VM, remain
> same, before and after the hibernation and its not very common that
> some offers are missing.

The wording here is a bit jumbled. And let's use consistent terminology.
I'd suggest:

In general, the boot-time devices configured for a resuming VM should be
the same as the devices in the VM at the time of hibernation. It's uncommon
for the configuration to have been changed such that offers are missing.
Changing the configuration violates the rules for hibernation anyway.

> The cleanup of missing channels is not
> straight-forward and dependent on individual device driver
> functionality and implementation, so it can be added in future as
> separate changes.
>
> Signed-off-by: John Starks <jostarks@xxxxxxxxxxxxx>
> Co-developed-by: Naman Jain <namjain@xxxxxxxxxxxxxxxxxxx>
> Signed-off-by: Naman Jain <namjain@xxxxxxxxxxxxxxxxxxx>
> Reviewed-by: Easwar Hariharan <eahariha@xxxxxxxxxxxxxxxxxxx>
> ---
> Changes since v1:
> https://lore.kernel.org/all/20241018115811.5530-1-namjain@xxxxxxxxxxxxxxxxxxx/
> * Added Easwar's Reviewed-By tag
> * Addressed Saurabh's comments:
>   * Added a note for missing channel cleanup in comments and commit msg
> ---
>  drivers/hv/vmbus_drv.c | 17 +++++++++++++++++
>  1 file changed, 17 insertions(+)
>
> diff --git a/drivers/hv/vmbus_drv.c b/drivers/hv/vmbus_drv.c
> index bd3fc41dc06b..08214f28694a 100644
> --- a/drivers/hv/vmbus_drv.c
> +++ b/drivers/hv/vmbus_drv.c
> @@ -2462,6 +2462,7 @@ static int vmbus_bus_suspend(struct device *dev)
>
>  static int vmbus_bus_resume(struct device *dev)
>  {
> +	struct vmbus_channel *channel;
>  	struct vmbus_channel_msginfo *msginfo;
>  	size_t msgsize;
>  	int ret;
> @@ -2494,6 +2495,22 @@ static int vmbus_bus_resume(struct device *dev)
>
>  	vmbus_request_offers();
>
> +	mutex_lock(&vmbus_connection.channel_mutex);
> +	list_for_each_entry(channel, &vmbus_connection.chn_list, listentry) {
> +		if (channel->offermsg.child_relid != INVALID_RELID)
> +			continue;
> +
> +		/* hvsock channels are not expected to be present. */
> +		if (is_hvsock_channel(channel))
> +			continue;
> +
> +		pr_err("channel %pUl/%pUl not present after resume.\n",
> +			&channel->offermsg.offer.if_type,
> +			&channel->offermsg.offer.if_instance);
> +		/* ToDo: Cleanup these channels here */
> +	}
> +	mutex_unlock(&vmbus_connection.channel_mutex);
> +

Dexuan and John have explained how in Azure VMs, there should not be any
VFs assigned to the VM at the time of hibernation. So the above check for
missing offers does not trigger an error message due to VF offers coming
after the all-offers-received message.

But what about the case of a VM running on a local Hyper-V? I'm not
completely clear, but in that case I don't think any VFs are removed
before the hibernation, especially for VM-initiated hibernation. It's
a reasonable scenario to later resume that same VM, with the same VF
assigned to the VM. Because of the way current code counts the offers,
vmbus_bus_resume() waits for the VF to be offered again, and all the
channels get correct post-resume relids. But the changes in this patch
set break that scenario. Since vmbus_bus_resume() now proceeds before
the VF offer arrives, hv_pci_resume() calling vmbus_open() could use the
pre-hibernation relid for the VF and break things. Certainly the "not
present after resume" error message would be spurious.

Maybe the focus here is Azure, and it's tolerable for the local Hyper-V
case with a VF to not work pending later fixes. But I thought I'd call
out the potential issue (assuming my thinking is correct).

Michael

>  	/* Reset the event for the next suspend. */
>  	reinit_completion(&vmbus_connection.ready_for_suspend_event);
>
> --
> 2.34.1