On 11/07/2017 04:18 AM, Cedric Bosdonnat wrote: > On Mon, 2017-11-06 at 15:53 -0500, John Ferlan wrote: >> Resolve a storage driver crash as a result of a long running >> storageVolCreateXML when the virStorageVolPoolRefreshThread is >> run as a result of a storageVolUpload complete and ran the >> virStoragePoolObjClearVols without checking if the creation >> code was currently processing a buildVol after incrementing >> the driver->asyncjob count. >> >> The refreshThread needs to wait until all creation threads >> are completed so as to not alter the volume list while the >> volume is being built. >> >> Crash from valgrind is as follows (with a bit of editing): >> >> ==21309== Invalid read of size 8 >> ==21309== at 0x153E47AF: storageBackendUpdateVolTargetInfo >> ==21309== by 0x153E4C30: virStorageBackendUpdateVolInfo >> ==21309== by 0x153E52DE: virStorageBackendVolRefreshLocal >> ==21309== by 0x153DE29E: storageVolCreateXML >> ==21309== by 0x562035B: virStorageVolCreateXML >> ==21309== by 0x147366: remoteDispatchStorageVolCreateXML >> ... >> ==21309== Address 0x2590a720 is 64 bytes inside a block of size 336 free'd >> ==21309== at 0x4C2F2BB: free >> ==21309== by 0x54CB9FA: virFree >> ==21309== by 0x55BC800: virStorageVolDefFree >> ==21309== by 0x55BF1D8: virStoragePoolObjClearVols >> ==21309== by 0x153D967E: virStorageVolPoolRefreshThread >> ... >> ==21309== Block was alloc'd at >> ==21309== at 0x4C300A5: calloc >> ==21309== by 0x54CB483: virAlloc >> ==21309== by 0x55BDC1F: virStorageVolDefParseXML >> ==21309== by 0x55BDC1F: virStorageVolDefParseNode >> ==21309== by 0x55BE5A4: virStorageVolDefParse >> ==21309== by 0x153DDFF1: storageVolCreateXML >> ==21309== by 0x562035B: virStorageVolCreateXML >> ==21309== by 0x147366: remoteDispatchStorageVolCreateXML >> ... >> >> Signed-off-by: John Ferlan <jferlan@xxxxxxxxxx> >> --- >> src/storage/storage_driver.c | 9 +++++++++ >> 1 file changed, 9 insertions(+) >> >> diff --git a/src/storage/storage_driver.c b/src/storage/storage_driver.c >> index b0edf9f885..5e920fc14c 100644 >> --- a/src/storage/storage_driver.c >> +++ b/src/storage/storage_driver.c >> @@ -2257,6 +2257,7 @@ virStorageVolPoolRefreshThread(void *opaque) >> virStorageBackendPtr backend; >> virObjectEventPtr event = NULL; >> >> + retry: >> storageDriverLock(); >> if (cbdata->vol_path) { >> if (virStorageBackendPloopRestoreDesc(cbdata->vol_path) < 0) >> @@ -2270,6 +2271,14 @@ virStorageVolPoolRefreshThread(void *opaque) >> if (!(backend = virStorageBackendForType(def->type))) >> goto cleanup; >> >> + /* Some thread is creating a new volume in the pool, we need to retry */ >> + if (virStoragePoolObjGetAsyncjobs(obj) > 0) { >> + virStoragePoolObjUnlock(obj); >> + storageDriverUnlock(); >> + usleep(100 * 1000); >> + goto retry; >> + } >> + >> virStoragePoolObjClearVols(obj); >> if (backend->refreshPool(NULL, obj) < 0) >> VIR_DEBUG("Failed to refresh storage pool"); > > ACK, does the job here. I'm rather surprised to see you implementing it > with sleep, while you pointed me towards virCondWait yesterday. But using > sleep is a way less intrusive change. > Well you only asked about an alternative mechanism and condition variable was the first thing that popped into my mind; however, as I got to actually doing more thinking about it - asyncjobs is not blocking multiple creates from occurring; whereas, a condition variable would be waiting for one thing to complete. My response to Jan also lists 2 other alternatives. This was just the "easiest" of the 3. If there's other ideas, I'm open to suggestions. John > -- > Cedric > -- libvir-list mailing list libvir-list@xxxxxxxxxx https://www.redhat.com/mailman/listinfo/libvir-list