Re: [PATCH 7/9] qemu: Grab modify job for changing domain XML

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 6/4/19 11:07 AM, Nikolay Shirokovskiy wrote:


On 30.05.2019 12:34, Michal Privoznik wrote:
Changing domain definition must be guarded with acquiring modify
job. The problem is if there is a thread executing say
qemuDomainSetMemoryStatsPeriod() which is an API that acquires
modify job and then possibly unlock the domain object and locks
it again. However, becasue virDomainObjAssignDef() does not take
a job (only object lock) it may have changed the domain
definition while the other thread unlocked the domain object in
order to talk on the monitor. For instance:

Thread1:
1) lookup domain object and lock it
2) acquire job
3) get pointers to live and persistent defs
4) unlock the domain object
5) talk to qemu on the monitor

Thread2 (Execute qemuDomainDefineXMLFlags):
1) lookup domain object and lock it
2) virDomainObjAssignDef() which overwrites persistent def and
    frees the old one
3) unlock domain object

Thread1:
6) lock the domain object again
7) try to access the persistent def via pointer stored in 3)

Now this will crash because the pointer from step 3) points to a
memory that was freed.

However, if Thread2 waited and acquired modify job (just like
Thread1) then these two would serialize and no harm would be
done.

Signed-off-by: Michal Privoznik <mprivozn@xxxxxxxxxx>
---
  src/qemu/qemu_domain.c    |  3 ++-
  src/qemu/qemu_driver.c    | 29 +++++++++++++++++++++++++----
  src/qemu/qemu_migration.c |  7 +++++++
  3 files changed, 34 insertions(+), 5 deletions(-)

diff --git a/src/qemu/qemu_domain.c b/src/qemu/qemu_domain.c
index c61d39b12b..fff2f1932e 100644
--- a/src/qemu/qemu_domain.c
+++ b/src/qemu/qemu_domain.c
@@ -8784,7 +8784,8 @@ qemuDomainRemoveInactive(virQEMUDriverPtr driver,
          return;
      }
- qemuDomainRemoveInactiveCommon(driver, vm);
+    if (vm->def)
+        qemuDomainRemoveInactiveCommon(driver, vm);
virDomainObjListRemove(driver->domains, vm);

Hmm, virDomainObjListRemoveLocked uses vm->def too.

  }
diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c
index 8bbac339e0..77cb7e4b87 100644
--- a/src/qemu/qemu_driver.c
+++ b/src/qemu/qemu_driver.c
@@ -1706,9 +1706,16 @@ static virDomainPtr qemuDomainCreateXML(virConnectPtr conn,
                                     VIR_DOMAIN_OBJ_LIST_ADD_CHECK_LIVE)))
          goto cleanup;
+ if (qemuDomainObjBeginJob(driver, vm, QEMU_JOB_MODIFY) < 0) {
+        qemuDomainRemoveInactive(driver, vm);
+        goto cleanup;
+    }
+
      virDomainObjAssignDef(vm, def, true, NULL);
      def = NULL;
+ qemuDomainObjEndJob(driver, vm);
+

Looks like we can begin VIR_DOMAIN_JOB_OPERATION_START from the start instead
of acquiring job twice.

That is an async job. We shouldn't rely on async job when modifying a domain.


      if (qemuProcessBeginJob(driver, vm, VIR_DOMAIN_JOB_OPERATION_START,
                              flags) < 0) {
          qemuDomainRemoveInactiveJob(driver, vm);
@@ -6976,6 +6983,11 @@ qemuDomainRestoreFlags(virConnectPtr conn,
                                     VIR_DOMAIN_OBJ_LIST_ADD_CHECK_LIVE)))
          goto cleanup;
+ if (qemuDomainObjBeginJob(driver, vm, QEMU_JOB_MODIFY) < 0) {
+        qemuDomainRemoveInactive(driver, vm);
+        goto cleanup;
+    }
+
      virDomainObjAssignDef(vm, def, true, NULL);
      def = NULL;
@@ -6988,6 +7000,7 @@ qemuDomainRestoreFlags(virConnectPtr conn,
          priv = vm->privateData;
          priv->hookRun = true;
      }> +    qemuDomainObjEndJob(driver, vm);

Same here

if (qemuProcessBeginJob(driver, vm, VIR_DOMAIN_JOB_OPERATION_RESTORE,
                              flags) < 0)
@@ -7651,6 +7664,11 @@ qemuDomainDefineXMLFlags(virConnectPtr conn,
                                     driver->xmlopt, 0)))
          goto cleanup;
+ if (qemuDomainObjBeginJob(driver, vm, QEMU_JOB_MODIFY) < 0) {
+        qemuDomainRemoveInactive(driver, vm);
+        goto cleanup;
+    }
+
      virDomainObjAssignDef(vm, def, false, &oldDef);
      def = NULL;
@@ -7673,7 +7691,7 @@ qemuDomainDefineXMLFlags(virConnectPtr conn,
              vm->persistent = 0;
              qemuDomainRemoveInactiveJob(driver, vm);
          }
-        goto cleanup;
+        goto endjob;
      }
event = virDomainEventLifecycleNewFromObj(vm,
@@ -7685,6 +7703,9 @@ qemuDomainDefineXMLFlags(virConnectPtr conn,
      VIR_INFO("Creating domain '%s'", vm->def->name);
      dom = virGetDomain(conn, vm->def->name, vm->def->uuid, vm->def->id);
+ endjob:
+    qemuDomainObjEndJob(driver, vm);
+
   cleanup:
      virDomainDefFree(oldDef);
      virDomainDefFree(def);

Additionally we need to replace qemuDomainRemoveInactiveJob with qemuDomainRemoveInactive
as we have a job already.

I'm planning on wrapping what you suggested in your reply to the cover letter into a single function. This way, the change won't be that intrusive and a lot of these changes won't be made.

Michal

--
libvir-list mailing list
libvir-list@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/libvir-list



[Index of Archives]     [Virt Tools]     [Libvirt Users]     [Lib OS Info]     [Fedora Users]     [Fedora Desktop]     [Fedora SELinux]     [Big List of Linux Books]     [Yosemite News]     [KDE Users]     [Fedora Tools]

  Powered by Linux