On 2015/4/23 17:46, Michal Privoznik wrote: > On 23.04.2015 11:32, Peter Krempa wrote: >> On Thu, Apr 23, 2015 at 11:19:34 +0200, Michal Privoznik wrote: >>> On 23.04.2015 10:30, Peter Krempa wrote: >>>> On Thu, Apr 23, 2015 at 11:44:49 +0800, zhang bo wrote: >>>>> The function qemuBuildCommandLine() may take a long time, for example >>>>> if we configure tens of vifs for the guest, each may cost hundrands of >>>>> milliseconds to create tap dev, senconds in total. Thus, unlock vm >>>>> before calling it. >>>>> >>>>> Signed-off-by: Zhang Bo <oscar.zhangbo@xxxxxxxxxx> >>>>> Signed-off-by: Zhou Yimin <zhouyimin@xxxxxxxxxx> >>>>> --- >>>>> src/qemu/qemu_process.c | 6 +++++- >>>>> 1 file changed, 5 insertions(+), 1 deletion(-) >>>>> >>>>> diff --git a/src/qemu/qemu_process.c b/src/qemu/qemu_process.c >>>>> index 753afe8..d1aaaec 100644 >>>>> --- a/src/qemu/qemu_process.c >>>>> +++ b/src/qemu/qemu_process.c >>>>> @@ -4628,14 +4628,18 @@ int qemuProcessStart(virConnectPtr conn, >>>>> } >>>>> >>>>> VIR_DEBUG("Building emulator command line"); >>>>> + virObjectUnlock(vm); >>>>> if (!(cmd = qemuBuildCommandLine(conn, driver, vm->def, priv->monConfig, >>>>> priv->monJSON, priv->qemuCaps, >>>>> migrateFrom, stdin_fd, snapshot, vmop, >>>>> &buildCommandLineCallbacks, false, >>>>> qemuCheckFips(), >>>>> priv->autoNodeset, >>>>> - &nnicindexes, &nicindexes))) >>>>> + &nnicindexes, &nicindexes))) { >>>>> + virObjectLock(vm); >>>>> goto cleanup; >>>>> + } >>>>> + virObjectLock(vm); >>>> >>>> Why do you need to unlock the object? The VM is starting at this point >>>> so you won't be able to save any time since APIs will either be blocked >>>> by a job or by the fact that the VM was not started. >>> >>> Not true. We still allow QUERY jobs, or APIs that lock a domain object >>> but don't necessarily take a job. For instance, if in one thread I'm >>> doing virDomainGetOSType(), virDomainGetInfo(), ... (essentially 'virsh >>> dominfo') while the other just starts the domain. I don't see a reason >>> why virDomainGetOSType() should wait for the domain to be started up. >>> Domain state should have no affect on the guest OS type, should it? >> >> OK that is technically right. I wanted to point out that most of the >> stats are available only for online machines or it shouldn't be much of >> a problem to dealy the delivery. >> >> Your example certainly doesn't illustrate why the delay to format the >> command line should be a problem when using libvirt. Or are you polling >> virDomainGetOSType every millisecond? >> >> I am curious to see why the delay would be a problem. > > Yep, I'm too. So far we don't really care much about libvirt response > time (otherwise our critical sections would be much shorter), but maybe > it's an issue for somebody. > The specific semantic is: migrating multiple guests simultaneously, downtime of each guest will add up, even to an unacceptable value. 1 suppose the downtime is 2 seconds when we migrate only 1 guest at one time. 2 suppose it costs 0.5sec to create each tapDev, and each guest has 20 vifs, that's 10 seconds for qemuBuildCommandLine. 3 now, we migrate 10 guest simultaneously, the downtime of the guests will vary from seconds to even 100 seconds. The reason for the problem is that: 1 guestA locks vm while creating each tapDev(virNetDevTapCreate) in qemuBuildCommandLine(), for 10seconds 2 guestB calls qemuMigrationPrepareAny->*virDomainObjListAdd* to get its vm object, which locks 'doms' and waits for the vm lock. 3 doms will be locked until guestA unlock its vm, we say that's 10 seconds. 4 guestC calls qemuDomainMigrateFinish3->virDomainObjListFindByName, which tries to lock doms. because it's now locked by guestB, guestC blocks here, and it can't be unpaused for at least 10 seconds. 5 then comes to guestD, guestE, guestF, etc, the downtime will be added up, to even 50 seconds or more. 6 the command 'virsh list' is blocked as well. Thus, we think the problem must be solved. >> >>> >>> On the other hand, I don't think we can just lock and unlock the domain >>> object as we please. qemuBuildCommandLine is a very complex function and >>> as such it calls many others. Some of them may rely on the fact, that >>> the object is locked by caller. >> >> Well, you definitely can't since almost everything in there is accessing >> vm->def. Locking semantics would be broken right in the line after @vm >> was unlocked by dereferencing it. > > Well, anything that would change a domain definition should grap a > MODIFY job. But such jobs are serialized, so even if we unlock the > domain we should be okay to still access vm->def. What I am more worried > about are all the small functions that interact with system or > something. Currently they are serialized by @vm lock, but one we remove > it ... > > Michal > > . > After the discussion above, maybe it's better to move virNetDevTapCreate() prior to qemuBuildCommandLine(), what do you think about that? If that's ok, I'd like to apply patchV2 here. -- libvir-list mailing list libvir-list@xxxxxxxxxx https://www.redhat.com/mailman/listinfo/libvir-list