Re: [PATCH v4 3/4] Implement the core API to suspend/resume the host

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Mon, Nov 28, 2011 at 05:33:22PM +0530, Srivatsa S. Bhat wrote:
> Add the core functions that implement the functionality of the API.
> Suspend is done by using an asynchronous mechanism so that we can return
> the status to the caller before the host gets suspended. This asynchronous
> operation is achieved by suspending the host in a separate thread of
> execution. However, returning the status to the caller is only best-effort,
> but not guaranteed.
> 
> To resume the host, an RTC alarm is set up (based on how long we want to
> suspend) before suspending the host. When this alarm fires, the host gets
> woken up.
> 
> Signed-off-by: Srivatsa S. Bhat <srivatsa.bhat@xxxxxxxxxxxxxxxxxx>
> ---
> 
>  src/libvirt_private.syms |    7 +
>  src/nodeinfo.c           |  245 ++++++++++++++++++++++++++++++++++++++++++++++
>  src/nodeinfo.h           |    9 ++
>  src/qemu/qemu_driver.c   |    4 +
>  4 files changed, 265 insertions(+), 0 deletions(-)

The nodeinfo.c file were a place to put APIs for collecting info about
the host CPU & memory stats. As such the suspend code does not really
belong there.

Please put it in two new files  src/util/virsuspend.[c,h]

and ensure all the internal APIs related to this have the name
prefix 'virSuspend' to match the filename...

> diff --git a/src/libvirt_private.syms b/src/libvirt_private.syms
> index 2cf50d3..5a83e62 100644
> --- a/src/libvirt_private.syms
> +++ b/src/libvirt_private.syms
> @@ -803,6 +803,13 @@ nodeGetCellsFreeMemory;
>  nodeGetFreeMemory;
>  nodeGetInfo;
>  nodeGetMemoryStats;
> +nodeSuspend;
> +nodeSuspendForDuration;
> +setNodeWakeup;

eg, these three would need renaming

> +virSuspendCleanup;
> +virSuspendInit;
> +virSuspendLock;
> +virSuspendUnlock;

I'm not really sure why we need to export all these. AFAICT,
you are only calling  virSuspendCleanup and nodeSuspendForDuration
from the QEMU driver. So all the other APIs could just be declared
static and thus not listed here

>  
>  
>  # nwfilter_conf.h
> diff --git a/src/nodeinfo.c b/src/nodeinfo.c
> index 6448b79..3f09a51 100644
> --- a/src/nodeinfo.c
> +++ b/src/nodeinfo.c
> @@ -46,6 +46,9 @@
>  #include "count-one-bits.h"
>  #include "intprops.h"
>  #include "virfile.h"
> +#include "command.h"
> +#include "threads.h"
> +#include "datatypes.h"
>  
>  
>  #define VIR_FROM_THIS VIR_FROM_NONE
> @@ -65,6 +68,11 @@
>  # define LINUX_NB_MEMORY_STATS_ALL 4
>  # define LINUX_NB_MEMORY_STATS_CELL 2
>  
> +/* Bitmask to hold the Power Management features supported by the host,
> + * such as Suspend-to-RAM (S3), Suspend-to-Disk (S4), Hybrid-Suspend etc.
> + */
> +static unsigned int hostPMFeatures;
> +
>  /* NB, this is not static as we need to call it from the testsuite */
>  int linuxNodeInfoCPUPopulate(FILE *cpuinfo,
>                               virNodeInfoPtr nodeinfo,
> @@ -897,3 +905,240 @@ unsigned long long nodeGetFreeMemory(virConnectPtr conn ATTRIBUTE_UNUSED)
>      return 0;
>  }
>  #endif
> +
> +
> +static int initialized;
> +virMutex virSuspendMutex;
> +
> +static void virSuspendLock(void)
> +{
> +    virMutexLock(&virSuspendMutex);
> +}
> +
> +static void virSuspendUnlock(void)
> +{
> +    virMutexUnlock(&virSuspendMutex);
> +}
> +
> +/**
> + * virSuspendInit:
> + *
> + * Get the system-wide sleep states supported by the host, such as
> + * Suspend-to-RAM (S3), Suspend-to-Disk (S4), or Hybrid-Suspend, so that
> + * a request to suspend/hibernate the host can be handled appropriately
> + * based on this information.
> + *
> + * Returns 0 if successful, and -1 in case of error.
> + */
> +static int virSuspendInit(void)
> +{
> +
> +    if (virMutexInit(&virSuspendMutex) < 0)
> +        return -1;
> +
> +    /* Get the power management capabilities supported by the host.
> +     * Ensure that this is done only once, by using the 'initialized'
> +     * variable.
> +     */
> +    if (virGetPMCapabilities(&hostPMFeatures) < 0) {
> +        VIR_ERROR(_("Failed to get host power management features"));
> +        return -1;
> +    }
> +
> +    return 0;
> +}
> +
> +
> +static bool aboutToSuspend;
> +
> +/**
> + * nodeSuspendForDuration:
> + * @conn: pointer to the hypervisor connection
> + * @state: the state to which the host must be suspended to -
> + *         VIR_NODE_S3             (Suspend-to-RAM),
> + *         VIR_NODE_S4             (Suspend-to-Disk),
> + *         VIR_NODE_HYBRID_SUSPEND (Hybrid-Suspend)
> + * @duration: the time duration in seconds, for which the host must be
> + *            suspended
> + * @flags: any flag values that might need to be passed; currently unused.
> + *
> + * Suspend the node (host machine) for the given duration of time in the
> + * specified state (such as S3, S4 or Hybrid-Suspend). Attempt to resume
> + * the node after the time duration is complete.
> + *
> + * First, an RTC alarm is set appropriately to wake up the node from its
> + * sleep state. Then the actual node suspend operation is carried out
> + * asynchronously in another thread, after a short time delay so as to
> + * give enough time for this function to return status to its caller
> + * through the connection. However this returning of status is best-effort
> + * only, and should generally happen due to the short delay but might be
> + * missed if the machine suspends first.
> + *
> + * Returns 0 in case the node is going to be suspended after a short delay,
> + * -1 if suspending the node is not supported, or if a previous suspend
> + * operation is still in progress.
> + */
> +int nodeSuspendForDuration(virConnectPtr conn ATTRIBUTE_UNUSED,
> +                           int state,
> +                           unsigned long long duration,
> +                           unsigned int flags ATTRIBUTE_UNUSED)
> +{
> +    static virThread thread;
> +    char *cmdString = NULL;
> +
> +    if (!initialized) {
> +        if (virSuspendInit() < 0)
> +            return -1;
> +        initialized = 1;
> +    }

This is a designed in race condition. You should instead be calling
virSuspendInit() from virInitialize() as we do for other initializers.

> +
> +    /*
> +     * Ensure that we are the only ones trying to suspend.
> +     * Fail if somebody has already initiated a suspend.
> +     */
> +    virSuspendLock();
> +
> +    if (aboutToSuspend) {
> +        /* A suspend operation is already in progress */
> +        virSuspendUnlock();
> +        return -1;
> +    } else
> +        aboutToSuspend = true;

Coding style requires  {} around the else clause too.

> +
> +    virSuspendUnlock();
> +
> +    /* Check if the host supports the requested suspend state */
> +    switch (state) {
> +    case VIR_NODE_S3:
> +        if (hostPMFeatures & VIR_NODE_S3) {
> +            cmdString = strdup("pm-suspend");
> +            if (cmdString == NULL) {
> +                virReportOOMError();
> +                goto cleanup;
> +            }
> +            break;
> +        }
> +        goto cleanup;

Needs to report an error, VIR_ERR_OPERATION_UNSUPPORTED

> +
> +    case VIR_NODE_S4:
> +        if (hostPMFeatures & VIR_NODE_S4) {
> +            cmdString = strdup("pm-hibernate");
> +            if (cmdString == NULL) {
> +                virReportOOMError();
> +                goto cleanup;
> +            }
> +            break;
> +        }
> +        goto cleanup;

Needs to report an error, VIR_ERR_OPERATION_UNSUPPORTED

> +
> +    case VIR_NODE_HYBRID_SUSPEND:
> +        if (hostPMFeatures & VIR_NODE_HYBRID_SUSPEND) {
> +            cmdString = strdup("pm-suspend-hybrid");
> +            if (cmdString == NULL) {
> +                virReportOOMError();
> +                goto cleanup;
> +            }
> +            break;
> +        }
> +        goto cleanup;

Needs to report an error, VIR_ERR_OPERATION_UNSUPPORTED

> +
> +    default:
> +        goto cleanup;

Needs to report an error, VIR_ERR_INVALID_ARG

> +    }
> +
> +    /* Just set the RTC alarm. Don't suspend yet. */
> +    if (setNodeWakeup(duration) < 0) {
> +        nodeReportError(VIR_ERR_INTERNAL_ERROR, "%s",
> +                        _("Failed to set up the RTC alarm for node wakeup\n"));

This overwrites errors already raised in setNodeWakeup

> +        goto cleanup;
> +    }
> +
> +    if (virThreadCreate(&thread, false, nodeSuspend, (void *)cmdString) < 0) {
> +        nodeReportError(VIR_ERR_INTERNAL_ERROR, "%s",
> +                        _("Failed to create thread to suspend the host\n"));
> +        goto cleanup;
> +    }
> +
> +    return 0;
> +
> +cleanup:
> +    VIR_FREE(cmdString);
> +    return -1;
> +}
> +
> +
> +# define SUSPEND_DELAY 10 /* in seconds */
> +
> +/**
> + * setNodeWakeup:
> + * @alarmTime: time in seconds from now, at which the RTC alarm has to be set.
> + *
> + * Set up the RTC alarm to the specified time.
> + * Return 0 on success, -1 on failure.
> + */
> +int setNodeWakeup(unsigned long long alarmTime)
> +{
> +    virCommandPtr setAlarmCmd;
> +    int ret = -1;
> +
> +    if (alarmTime <= SUSPEND_DELAY)
> +        return -1;

Needs to report an error

> +
> +    setAlarmCmd = virCommandNewArgList("rtcwake", "-m", "no", "-s", NULL);
> +    virCommandAddArgFormat(setAlarmCmd, "%lld", alarmTime);
> +
> +    if (virCommandRun(setAlarmCmd, NULL) < 0) {
> +        nodeReportError(VIR_ERR_INTERNAL_ERROR,
> +                        "%s", _("Failed to set up the RTC alarm\n"));
> +        goto cleanup;
> +    }

This overwrites an error already reported by virCommandRun

> +
> +    ret = 0;
> +
> +cleanup:
> +    virCommandFree(setAlarmCmd);
> +    return ret;
> +}
> +
> +
> +/**
> + * nodeSuspend:
> + * @cmdString: pointer to the command string this thread has to execute.
> + *
> + * Actually perform the suspend operation by invoking the command.
> + * Give a short delay before executing the command so as to give a chance
> + * to virNodeSuspendForDuration() to return the status to the caller.
> + * If we don't give this delay, that function will not be able to return
> + * the status, since the suspend operation would have begun and hence no
> + * data can be sent through the connection to the caller. However, with
> + * this delay added, the status return is best-effort only.
> + */
> +void nodeSuspend(void *cmdString)
> +{
> +    virCommandPtr suspendCmd = virCommandNew((char *)cmdString);
> +
> +    VIR_FREE(cmdString);
> +
> +    /* Delay for sometime so that the function nodeSuspendForDuration()
> +     * can return the status to the caller.
> +     */
> +    sleep(SUSPEND_DELAY);
> +    if (virCommandRun(suspendCmd, NULL) < 0) {
> +        nodeReportError(VIR_ERR_INTERNAL_ERROR,
> +                        "%s", _("Failed to suspend the node\n"));
> +    }

Overwrites the error already reported

> +
> +    virCommandFree(suspendCmd);
> +
> +    /* Now that we have resumed from suspend, reset 'aboutToSuspend' flag. */
> +    virSuspendLock();
> +    aboutToSuspend = false;
> +    virSuspendUnlock();
> +}
> +
> +void virSuspendCleanup(void)
> +{
> +    if (initialized)
> +        virMutexDestroy(&virSuspendMutex);
> +}

This is not good. It is assuming that the QEMU driver is the only
driver that will ever use this. It should be implemented by the Xen,
LXC, UML drivers too. For static initialized data, we generally do
not ever destroy it, so IMHO this method can just be deleted.

> +
> diff --git a/src/nodeinfo.h b/src/nodeinfo.h
> index 4766152..8ec887d 100644
> --- a/src/nodeinfo.h
> +++ b/src/nodeinfo.h
> @@ -46,4 +46,13 @@ int nodeGetCellsFreeMemory(virConnectPtr conn,
>                             int maxCells);
>  unsigned long long nodeGetFreeMemory(virConnectPtr conn);
>  
> +int nodeSuspendForDuration(virConnectPtr conn,
> +                           int state,
> +                           unsigned long long duration,
> +                           unsigned int flags);
> +
> +int setNodeWakeup(unsigned long long alarmTime);
> +void nodeSuspend(void *cmdString);
> +void virSuspendCleanup(void);
> +
>  #endif /* __VIR_NODEINFO_H__*/
> diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c
> index fe2ab85..9cee95e 100644
> --- a/src/qemu/qemu_driver.c
> +++ b/src/qemu/qemu_driver.c
> @@ -777,6 +777,9 @@ qemudShutdown(void) {
>  
>      virSysinfoDefFree(qemu_driver->hostsysinfo);
>  
> +    /* Cleanup the structures initialized for suspending the host */
> +    virSuspendCleanup();
> +
>      qemuProcessAutoDestroyShutdown(qemu_driver);
>  
>      VIR_FREE(qemu_driver->configDir);
> @@ -10918,6 +10921,7 @@ static virDriver qemuDriver = {
>      .domainGetBlockJobInfo = qemuDomainGetBlockJobInfo, /* 0.9.4 */
>      .domainBlockJobSetSpeed = qemuDomainBlockJobSetSpeed, /* 0.9.4 */
>      .domainBlockPull = qemuDomainBlockPull, /* 0.9.4 */
> +    .nodeSuspendForDuration = nodeSuspendForDuration, /* 0.9.8 */
>  };

Since this is using a shared helper, you can trivially add this to the
UML, LXC and Xen drivers too, which all run inside libvirtd.

Regards,
Daniel
-- 
|: http://berrange.com      -o-    http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org              -o-             http://virt-manager.org :|
|: http://autobuild.org       -o-         http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org       -o-       http://live.gnome.org/gtk-vnc :|

--
libvir-list mailing list
libvir-list@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/libvir-list


[Index of Archives]     [Virt Tools]     [Libvirt Users]     [Lib OS Info]     [Fedora Users]     [Fedora Desktop]     [Fedora SELinux]     [Big List of Linux Books]     [Yosemite News]     [KDE Users]     [Fedora Tools]