On 03/13/2017 06:29 AM, Michal Privoznik wrote:
> I am not able to test this properly (my host has "just" 32GiB of RAM), but I've
> patched qemu to insert some delay into its init process and it worked just
> fine.
Damn, I should have scanned the list today before also working on this problem.
Attached is a patch I used to test the concept. It's a hack, but in the end
calculates the same timeout as your series. It worked well in my testing.
> Michal Privoznik (2):
>   virTimeBackOffWait: Avoid long periods of sleep
>   qemu: Adaptive timeout for connecting to monitor
Yep, I like these changes over the hack :-).
Regards,
Jim
>  src/qemu/qemu_capabilities.c | 2 +-
>  src/qemu/qemu_monitor.c | 36 +++++++++++++++++++++++++++++++-----
>  src/qemu/qemu_monitor.h | 1 +
>  src/qemu/qemu_process.c | 8 ++++++++
>  src/util/virtime.c | 14 ++++++++++++--
>  tests/qemumonitortestutils.c | 1 +
>  6 files changed, 54 insertions(+), 8 deletions(-)
From df3a630d592c9c3e2f1c4d05cbc9e6423b1f2302 Mon Sep 17 00:00:00 2001
From: Jim Fehlig <jfehlig@xxxxxxxx>
Date: Tue, 14 Mar 2017 15:27:47 -0600
Subject: [PATCH] qemu: scale monitor timeout based on VM memory size
Large memory VMs backed by 1G huge pages can result in monitor
timeouts due to lengthy time spent in the kernel zeroing pages.
E.g. pre-allocating 402GB worth of 1G hugetlbfs pages can well
exceed the current 30 second timeout:
real 105.47
user 0.05
sys 105.42
Instead of simply bumping the timeout and receiving another report
in the future when someone pre-allocates 402TB, scale the timeout
per-VM based on the configured memory size. If the scaled timeout
is less than 30 seconds, the current 30 second timeout is retained.
The simple heuristic in this patch allows 5 seconds of timeout
for each full 5G of memory. On one test machine it was observed to
take ~1.5 seconds to pre-allocate 5G of memory. The allocation time
appears to be roughly linear as well. E.g. it took ~5 seconds to
pre-allocate 20G of memory.
Signed-off-by: Jim Fehlig <jfehlig@xxxxxxxx>
---
src/qemu/qemu_domain.c | 14 ++++++++++++++
src/qemu/qemu_domain.h | 4 ++++
src/qemu/qemu_monitor.c | 7 ++++---
3 files changed, 22 insertions(+), 3 deletions(-)
diff --git a/src/qemu/qemu_domain.c b/src/qemu/qemu_domain.c
index 07ce22417..f56548078 100644
--- a/src/qemu/qemu_domain.c
+++ b/src/qemu/qemu_domain.c
@@ -5737,6 +5737,20 @@ qemuDomainGetMonitor(virDomainObjPtr vm)
}
+int
+qemuDomainGetMonitorTimeout(virDomainDefPtr def)
+{
+ int timeout = 30;
+
+ if (def == NULL)
+ return timeout;
+
+ timeout = 5 * (def->mem.cur_balloon / (5 * 1024 * 1024));
+
+ return MAX(timeout, 30);
+}
+
+
/**
* qemuDomainSupportsBlockJobs:
* @vm: domain object
diff --git a/src/qemu/qemu_domain.h b/src/qemu/qemu_domain.h
index c646828e6..ded3a1957 100644
--- a/src/qemu/qemu_domain.h
+++ b/src/qemu/qemu_domain.h
@@ -448,6 +448,10 @@ void qemuDomainObjReleaseAsyncJob(virDomainObjPtr obj);
qemuMonitorPtr qemuDomainGetMonitor(virDomainObjPtr vm)
ATTRIBUTE_NONNULL(1);
+
+int
+qemuDomainGetMonitorTimeout(virDomainDefPtr def);
+
void qemuDomainObjEnterMonitor(virQEMUDriverPtr driver,
virDomainObjPtr obj)
ATTRIBUTE_NONNULL(1) ATTRIBUTE_NONNULL(2);
diff --git a/src/qemu/qemu_monitor.c b/src/qemu/qemu_monitor.c
index d71f84c80..641f801fb 100644
--- a/src/qemu/qemu_monitor.c
+++ b/src/qemu/qemu_monitor.c
@@ -327,7 +327,7 @@ qemuMonitorDispose(void *obj)
static int
-qemuMonitorOpenUnix(const char *monitor, pid_t cpid)
+qemuMonitorOpenUnix(const char *monitor, pid_t cpid, int monTimeout)
{
struct sockaddr_un addr;
int monfd;
@@ -348,7 +348,7 @@ qemuMonitorOpenUnix(const char *monitor, pid_t cpid)
goto error;
}
- if (virTimeBackOffStart(&timeout, 1, 30*1000 /* ms */) < 0)
+ if (virTimeBackOffStart(&timeout, 1, monTimeout * 1000 /* ms */) < 0)
goto error;
while (virTimeBackOffWait(&timeout)) {
ret = connect(monfd, (struct sockaddr *) &addr, sizeof(addr));
@@ -881,11 +881,12 @@ qemuMonitorOpen(virDomainObjPtr vm,
int fd;
bool hasSendFD = false;
qemuMonitorPtr ret;
+ int monTimeout = qemuDomainGetMonitorTimeout(vm ? vm->def : NULL);
switch (config->type) {
case VIR_DOMAIN_CHR_TYPE_UNIX:
hasSendFD = true;
- if ((fd = qemuMonitorOpenUnix(config->data.nix.path, vm ? vm->pid : 0)) < 0)
+ if ((fd = qemuMonitorOpenUnix(config->data.nix.path, vm ? vm->pid : 0, monTimeout)) < 0)
return NULL;
break;
--
2.11.0
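
For anyone who wants to check the numbers without building libvirt, the
heuristic from the commit message can be sketched as a standalone C
function. This is only an illustration, not code from the patch: the name
scaled_monitor_timeout and the two macros are made up here, and it assumes
the memory size is expressed in KiB, which is what the 5 * 1024 * 1024
divisor in the patch implies.

```c
/* Hypothetical stand-in for the patch's heuristic; the names below are
 * illustrative and do not exist in libvirt. */
#define MIN_TIMEOUT_SECS 30ULL                 /* the existing fixed timeout */
#define KIB_PER_5GIB (5ULL * 1024 * 1024)      /* 5 GiB expressed in KiB */

/* Return the monitor timeout in seconds for a guest with mem_kib KiB of
 * memory: 5 seconds per full 5 GiB, but never less than 30 seconds. */
static unsigned long long
scaled_monitor_timeout(unsigned long long mem_kib)
{
    unsigned long long timeout = 5 * (mem_kib / KIB_PER_5GIB);

    return timeout > MIN_TIMEOUT_SECS ? timeout : MIN_TIMEOUT_SECS;
}
```

With these assumptions a 402 GiB guest gets a 400 second timeout, well
above the ~105 seconds measured above, while anything up to 32 GiB keeps
the current 30 seconds.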