Re: an AB deadlock or libvirtd crash problem in virsh console and virsh destroy

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 04.08.2016 14:37, weifuqiang wrote:

> The reason of this problem is that fdstream abort event or close event occured at the same time, libvirtd doesn't deal with the synchronousness well enough.
> the flows about fdStream is bellow
> 1、qemuDomainDefineXMLFlags -> virDomainObjListAdd  -> qemuDomainObjPrivateAlloc -> virChrdevAlloc -> virHashCreate
> 2、qemuDomainOpenConsole -> virChrdevOpen -> virHashAddEntry(devs->hash, path, st)
> 3、virDomainObjDispose -> privateDataFreeFunc (qemuDomainObjPrivateFree) - > virChrdevFree(*dev locked*) -> virChrdevFreeClearCallbacks - >  virFDStreamSetInternalCloseCb(*fdst locked*)
> 4、virFDStreamCloseInt (*fdst locked*) -> icbFreeOpaque(virChrdevFDStreamCloseCb (*dev locked*)) -> virHashRemoveEntry

Thank you for your very detailed description of the problem.

> 
> The AB lock problem is obviouse: in step 3, it locks chardev before fdst, and in step 4, it's the opsite way.
> The reason of libvirtd crash is that: in virFDStreamCloseInt function we set fdst to NULL, while in virFDStreamSetInternalCloseCb we use fdst->lock, note that fdst has already been freed.
> another crash problem occurs because that when virChrdevFree was earlier finished, dev->hash freed and all date of hash is freed, but fdstream event flow use fdStream after hash free. Ahha~~, libvirtd coredump.
> 
> All of those problem is because  clear vm flow and fdStream flow concurs synchronously.
> 
> then I fix this problem by modify virChrdevFree()
> 
> void virChrdevFree(virChrdevsPtr devs)
> {
>     if (!devs || !devs->hash)
>         return;
> 
>     for (;;) {
>         virMutexLock(&devs->lock);
>         if (0 == virHashSize(devs->hash)) {
>             virMutexUnlock(&devs->lock);
>             break;
>         }
>         virMutexUnlock(&devs->lock);
>         usleep(10 * 1000);
>     }
> 
>     virMutexLock(&devs->lock);
>     virHashFree(devs->hash);
>     virMutexUnlock(&devs->lock);
>     virMutexDestroy(&devs->lock);
>     VIR_FREE(devs);

Usually, a race fix that contains usleep() is not really a fix.

> }
> If the chardev is removed by fdStream close or fdStream abort when vm destroy or shutdown, the modification works well. But I'm not sure is that all chardev would be removed when we clear vm,  if not, it would always sleep here.
> 
> Another solution is as follows:
> 
> virMutexLock(vm);
> virChrdevFree();
> virMutexUnlock(vm);
> 
> virMutexLock(vm);
> virFDStreamCloseInt();
> virMutexUnlock(vm);

I like this one more.

However, what I'm thinking is:
1) looks like when qemuDomainObjPrivateFree() is called, it's a lost
game thereafter. I mean, qemuDomainObjPrivateFree() callsirChrdevFree()
which destroys the virChrdevs::mutex. Any later attempts to lock it will
result in crash (or something similarly nasty). Therefore I think in
virChrdevFree() we should unregister the stream close callbacks (icbCb
and icbFreeOpaque). We are doing the cleanup anyway.


2) I'm wondering whether it is necessary to have the close callbacks
(icbCb and icbFreeOpaque) guarded with stream mutex locked. Looks to me
like we could unlock the stream while the callback are running and then
lock it back again.

Michal

--
libvir-list mailing list
libvir-list@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/libvir-list




[Index of Archives]     [Virt Tools]     [Libvirt Users]     [Lib OS Info]     [Fedora Users]     [Fedora Desktop]     [Fedora SELinux]     [Big List of Linux Books]     [Yosemite News]     [KDE Users]     [Fedora Tools]