Re: [PATCH v2 2/2] Drivers: hv: vmbus: Log on missing offers

On 11/13/2024 8:56 PM, Michael Kelley wrote:
From: Naman Jain <namjain@xxxxxxxxxxxxxxxxxxx> Sent: Wednesday, November 13, 2024 12:47 AM

On 11/12/2024 8:43 AM, Michael Kelley wrote:
From: Naman Jain <namjain@xxxxxxxxxxxxxxxxxxx> Sent: Sunday, November 10, 2024 9:44 PM

On 11/7/2024 11:14 AM, Naman Jain wrote:

On 11/1/2024 12:44 AM, Michael Kelley wrote:
From: Naman Jain <namjain@xxxxxxxxxxxxxxxxxxx> Sent: Tuesday, October 29, 2024 1:02 AM


[snip]

<snip>


1)  VM boots with the intent of resuming from hibernation (though
Hyper-V doesn't know about that intent)
2)  Original fresh kernel is loaded and begins initialization
3)  VMBus offers come in for boot-time devices, which excludes SR-IOV VFs.
4)  ALLOFFERS_DELIVERED message comes in
5)  The storvsc driver initializes for the virtual disks on the VM
6)  Kernel initialization code finds and reads the swap space to see if a
hibernation image is present. If so, it reads in the hibernation image.
7)  The suspend sequence is initiated (just like during hibernation)
to shut down the VMBus devices and terminate the VMBus connection.
8)  Control is transferred to the previously read-in hibernation image
9)  The hibernation image runs the resume sequence, which
initiates a new VMBus connection and requests offers
10) VMBus offers come in for whatever VMBus devices were present
when Step 7 initiated the suspend sequence. If a VF device was present
at that time, an offer for that VF device will come in and will match up
with the VF that was present in the VM at the time of hibernation.
11) ALLOFFERS_DELIVERED message comes in again for the
newly initiated VMBus connection.
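
The eleven steps above can be sketched as a toy event trace. This is only
a model under stated assumptions, not kernel code: the Connection class,
its fields, and hibernation_resume_trace() are hypothetical names, and the
device labels are simply the ones used in this thread.

```python
from dataclasses import dataclass, field


@dataclass
class Connection:
    """Toy model of one VMBus connection; all names here are hypothetical."""
    name: str
    boot_offers: list = field(default_factory=list)  # before ALLOFFERS_DELIVERED
    late_offers: list = field(default_factory=list)  # after ALLOFFERS_DELIVERED
    all_delivered: bool = False

    def offer(self, device: str) -> None:
        # Classify an offer by whether it arrives before or after the
        # ALLOFFERS_DELIVERED message on this connection.
        target = self.late_offers if self.all_delivered else self.boot_offers
        target.append(device)

    def all_offers_delivered(self) -> None:
        self.all_delivered = True


def hibernation_resume_trace(vf_present_at_suspend: bool):
    # Steps 2-4: the fresh kernel gets boot-time offers, which exclude the VF.
    fresh = Connection("fresh kernel")
    for dev in ("storvsc", "netvsc"):
        fresh.offer(dev)
    fresh.all_offers_delivered()

    # After Step 4: netvsc initializes and a VF offer may follow later.
    if vf_present_at_suspend:
        fresh.offer("vf")

    # Steps 7-9: the old connection is torn down and the hibernation
    # image opens a new one.
    resumed = Connection("resumed image")

    # Step 10: offers for whatever was present when Step 7 ran.
    for dev in ("storvsc", "netvsc"):
        resumed.offer(dev)
    if vf_present_at_suspend:
        resumed.offer("vf")

    # Step 11: ALLOFFERS_DELIVERED on the new connection.
    resumed.all_offers_delivered()
    return fresh, resumed
```

In this model the VF shows up as a late offer on the fresh kernel's
connection but as a boot-time offer on the resumed image's connection,
which is the asymmetry the steps describe.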


Steps 3) and 4) work differently IMO. There is no request_for_offers or
ALLOFFERS_DELIVERED for the fresh kernel. Otherwise, after adding prints
in the kernel, we should have seen these function calls *twice* in one
hibernation-resume cycle. But that is not the case.


I was looking in the wrong place for the fresh kernel logs. The sequence
you mentioned is indeed correct and aligns with my understanding and
experimental results. Kindly ignore my comment above.

When the older/original kernel boots up and requests offers, it gets
those VF offers again as part of the boot-time offers, and then the
ALLOFFERS_DELIVERED msg comes. I'm still trying to figure out how the
fresh kernel requests VF offers, or whether it gets those offers
automatically from the host. I will update my findings so that they can
be put in the documentation you mentioned.

The fresh kernel does not seem to get these VF channel offers
automatically, but the resuming kernel does when it calls
request_for_offers().


Regards,
Naman


Hmmm. I'm not sure what might be happening. I'll be interested in
what you find. I do indeed want to call out the details in my
documentation. And I'll also try to repro myself.

Michael


The netvsc driver gets initialized *after* step 4, but we don't know
exactly *when* relative to the storvsc driver. The netvsc driver must
tell Hyper-V that it can handle an SR-IOV VF, and the VF offer is sent
sometime after that. While this netvsc/VF sequence is happening, the
storvsc driver is reading the hibernation image from swap (Step 6).


Maybe this is how the fresh kernel gets the offers for VF devices.

I think the sequence you describe works when reading the
hibernation image from swap takes tens of seconds, or even several
minutes in an Azure VM with a remote disk. That gives plenty
of time for the VF to get initialized and be fully present when Step 7
starts. But there's no *guarantee* that the VF is initialized by then.
It's also not clear to me what action by the guest causes Hyper-V to
treat the VF as "added to the VM" so that in Step 10 the VF offer is
sent before ALLOFFERS_DELIVERED.

The sequence you describe also happens in an Azure VM, even if
the VF is removed before hibernation. When the VF offer arrives
during Step 10, it doesn't match with any VFs that were in the VM
at the time of hibernation. It's treated as a new device, just like it
would be if the offer arrived after ALLOFFERS_DELIVERED.
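
The matching described here can be sketched as a set comparison between
the devices recorded in the hibernation image and the offers that arrive
in Step 10. This is a minimal illustration only: classify_resume_offers()
is a hypothetical helper, devices are keyed by plain strings rather than
whatever identifiers the kernel actually compares, and the "missing"
bucket is an assumption about what the patch under discussion would log.

```python
def classify_resume_offers(hibernated, offered):
    """Compare devices known at hibernation time with offers seen on resume.

    Both arguments are iterables of device identifiers (plain strings in
    this sketch, which is an assumption).
    """
    hibernated, offered = set(hibernated), set(offered)
    return {
        # Offer matches a device that was in the VM at hibernation time.
        "matched": sorted(hibernated & offered),
        # Offer matches nothing from hibernation: treated as a new device.
        "new": sorted(offered - hibernated),
        # Device was present at hibernation but no offer arrived for it.
        "missing": sorted(hibernated - offered),
    }
```

With the VF removed before hibernation, a VF offer arriving in Step 10
lands in the "new" bucket.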

But it seems like there's still the risk of having a fast swap disk
and a small hibernation image that can be read in a shorter amount
of time than it takes to initialize the VF to the point that Hyper-V
treats it as added to the VM. Without knowing what that point is,
it's hard to assess the likelihood of that happening. Or maybe there's
an interlock I'm not aware of that ensures Step 7 can't proceed
while the netvsc/VF sequence is in progress.
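
The race in question reduces to comparing two durations. The sketch below
assumes there is no interlock and that the VF becomes "added to the VM" at
a single well-defined instant, neither of which is established in this
thread. The function name and parameters are hypothetical.

```python
def vf_present_at_step7(image_read_s: float, vf_ready_s: float) -> bool:
    """Return True if the VF is fully present when the suspend sequence starts.

    image_read_s: seconds needed to read the hibernation image from swap
                  (Step 6); Step 7 is assumed to start right after.
    vf_ready_s:   seconds until Hyper-V treats the VF as added to the VM,
                  measured from the same starting point (an assumption).
    """
    # The VF is present only if it finished initializing before the
    # image read completed and Step 7 began.
    return vf_ready_s <= image_read_s
```

For example, with purely illustrative values rather than measurements,
vf_present_at_step7(60.0, 5.0) is True for a slow remote disk, while
vf_present_at_step7(0.5, 5.0) is False for a fast disk and a small image.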

So maybe it's best to proceed with this patch, and deal with the
risk later when/if it becomes reality. I'm OK if you want to do
that. This has been an interesting discussion that I'll try to capture
in some high-level documentation about how Linux guests on
Hyper-V do hibernation!

Michael



I have sent v3 with the changes we discussed.

Regards,
Naman




