On 10/21/2024 10:27 AM, Saurabh Singh Sengar wrote:
On Fri, Oct 18, 2024 at 04:58:10AM -0700, Naman Jain wrote:
Channels offers are requested during vmbus initialization and resume
Nit: s/vmbus/VMBus
Thanks for reviewing Saurabh.
Noted this for next patch. Thanks
from hibernation. Add support to wait for all channel offers to be
delivered and processed before returning from vmbus_request_offers.
This is to support user mode (VTL0) in OpenHCL (A Linux based
paravisor for Confidential VMs) to ensure that all channel offers
are present when init begins in VTL0, and it knows which channels
the host has offered and which it has not.
Usermode isn't necessarily of VTL0, and this issue was actually identified
at a higher VTL in OpenHCL. However, this change isn't specific to OpenHCL,
but is intended for general use. I would prefer if the commit message were
either more generic or precisely aligned with the specific issue it's
addressing.
I'll make it generic.
This is in analogy to a PCI bus not returning from probe until it has
scanned all devices on the bus.
Without this, user mode can race with vmbus initialization and miss
channel offers. User mode has no way to work around this other than
sleeping for a while, since there is no way to know when vmbus has
finished processing offers.
With this added functionality, remove earlier logic which keeps track
of count of offered channels post resume from hibernation. Once all
offers delivered message is received, no further offers are going to
be received. Consequently, logic to prevent suspend from happening
after previous resume had missing offers, is also removed.
Co-developed-by: John Starks <jostarks@xxxxxxxxxxxxx>
Signed-off-by: John Starks <jostarks@xxxxxxxxxxxxx>
Signed-off-by: Naman Jain <namjain@xxxxxxxxxxxxxxxxxxx>
---
drivers/hv/channel_mgmt.c | 38 +++++++++++++++++++++++---------------
drivers/hv/connection.c | 4 ++--
drivers/hv/hyperv_vmbus.h | 14 +++-----------
<..>
}
+ /* Wait for the host to send all offers. */
+ while (wait_for_completion_timeout(
+ &vmbus_connection.all_offers_delivered_event, msecs_to_jiffies(10 * 1000)) == 0) {
Nit: Can simply put 10000 instead of 10*1000
Noted.
+ pr_warn("timed out waiting for all offers to be delivered...\n");
I know we are moving from async to sync, so earlier we never checked this.
But what if some channel timed out do we want to handle this case ? Or put
a comment why this is OK.
We could set error from here as well, but I see vmbus_request_offers return value
is never checked.
It seems to be a best effort way to get all the offers. If its not
received in time, I think we can print a warning and continue. I can add
that to a comment.
+ }
+
+ /*
+ * Flush handling of offer messages (which may initiate work on
+ * other work queues).
+ */
Regards,
Naman