Martin Kletzander wrote: > On Tue, Mar 24, 2015 at 02:43:42PM -0600, Jim Fehlig wrote: >> Recent testing on large memory systems revealed a bug in the Xen xl >> tool's freemem() function. When autoballooning is enabled, freemem() >> is used to ensure enough memory is available to start a domain, >> ballooning dom0 if necessary. When ballooning large amounts of memory >> from dom0, freemem() would exceed its self-imposed wait time and >> return an error. Meanwhile, dom0 continued to balloon. Starting the >> domain later, after sufficient memory was ballooned from dom0, would >> succeed. The libvirt implementation in libxlDomainFreeMem() suffers >> the same bug since it is modeled after freemem(). >> >> In the end, the best place to fix the bug on the Xen side was to >> slightly change the behavior of libxl_wait_for_memory_target(). >> Instead of failing after caller-provided wait_sec, the function now >> blocks as long as dom0 memory ballooning is progressing. It will return >> failure only when more memory is needed to reach the target and wait_sec >> have expired with no progress being made. See xen.git commit fd3aa246. >> There was a dicussion on how this would affect other libxl apps like >> libvirt >> >> http://lists.xen.org/archives/html/xen-devel/2015-03/msg00739.html >> >> If libvirt containing this patch was build against a Xen containing >> the old libxl_wait_for_memory_target() behavior, libxlDomainFreeMem() >> will fail after 30 sec and domain creation will be terminated. >> Without this patch and with old libxl_wait_for_memory_target() behavior, >> libxlDomainFreeMem() does not succeed after 30 sec, but returns success >> anyway. Domain creation continues resulting in all sorts of fun stuff >> like cpu soft lockups in the guest OS. It was decided to properly fix >> libxl_wait_for_memory_target(), and if anything improve the default >> behavior of apps using the freemem reference impl in xl. >> >> xl was patched to accommodate the change in >> libxl_wait_for_memory_target() >> with xen.git commit 883b30a0. This patch does the same in the libxl >> driver. While at it, I changed the logic to essentially match >> freemem() in $xensrc/tools/libxl/xl_cmdimpl.c. It was a bit cleaner >> IMO and will make it easier to spot future, potentially interesting >> divergences. >> >> Signed-off-by: Jim Fehlig <jfehlig@xxxxxxxx> >> --- >> >> V2: Actually use libxl_wait_for_memory_target(), instead of >> libxl_wait_for_free_memory() >> >> src/libxl/libxl_domain.c | 55 >> +++++++++++++++++++++++------------------------- >> 1 file changed, 26 insertions(+), 29 deletions(-) >> >> diff --git a/src/libxl/libxl_domain.c b/src/libxl/libxl_domain.c >> index 407a9bd..a1739aa 100644 >> --- a/src/libxl/libxl_domain.c >> +++ b/src/libxl/libxl_domain.c >> @@ -1121,38 +1121,39 @@ libxlDomainFreeMem(libxlDomainObjPrivatePtr >> priv, libxl_domain_config *d_config) >> { >> uint32_t needed_mem; >> uint32_t free_mem; >> - size_t i; >> - int ret = -1; >> + int ret; > > This variable is unnecessary and confusing since you don't return it. > And without this, you can make ... > >> int tries = 3; >> int wait_secs = 10; >> >> - if ((ret = libxl_domain_need_memory(priv->ctx, &d_config->b_info, >> - &needed_mem)) >= 0) { >> - for (i = 0; i < tries; ++i) { >> - if ((ret = libxl_get_free_memory(priv->ctx, &free_mem)) >> < 0) >> - break; >> + ret = libxl_domain_need_memory(priv->ctx, &d_config->b_info, >> &needed_mem); >> + if (ret < 0) >> + goto error; >> > > ... these conditions smaller, e.g.: > > if (libxl_domain_need_memory(priv->ctx, &d_config->b_info, > &needed_mem) < 0) > goto error; Yeah, too much copy and paste. I'll fix this up and send a V3. > Do you need this for the release? (sorry for noticing that late) No, this can wait. From a Xen perspective, more useful work for the release would have been this series https://www.redhat.com/archives/libvir-list/2015-March/msg01337.html But I think it is a bit too late now :-(. Regards, Jim -- libvir-list mailing list libvir-list@xxxxxxxxxx https://www.redhat.com/mailman/listinfo/libvir-list