On Tue, 2017-11-07 at 08:43 -0500, John Ferlan wrote: > > On 11/07/2017 04:18 AM, Cedric Bosdonnat wrote: > > On Mon, 2017-11-06 at 15:53 -0500, John Ferlan wrote: > > > Resolve a storage driver crash as a result of a long running > > > storageVolCreateXML when the virStorageVolPoolRefreshThread is > > > run as a result of a storageVolUpload complete and ran the > > > virStoragePoolObjClearVols without checking if the creation > > > code was currently processing a buildVol after incrementing > > > the driver->asyncjob count. > > > > > > The refreshThread needs to wait until all creation threads > > > are completed so as to not alter the volume list while the > > > volume is being built. > > > > > > Crash from valgrind is as follows (with a bit of editing): > > > > > > ==21309== Invalid read of size 8 > > > ==21309== at 0x153E47AF: storageBackendUpdateVolTargetInfo > > > ==21309== by 0x153E4C30: virStorageBackendUpdateVolInfo > > > ==21309== by 0x153E52DE: virStorageBackendVolRefreshLocal > > > ==21309== by 0x153DE29E: storageVolCreateXML > > > ==21309== by 0x562035B: virStorageVolCreateXML > > > ==21309== by 0x147366: remoteDispatchStorageVolCreateXML > > > ... > > > ==21309== Address 0x2590a720 is 64 bytes inside a block of size 336 free'd > > > ==21309== at 0x4C2F2BB: free > > > ==21309== by 0x54CB9FA: virFree > > > ==21309== by 0x55BC800: virStorageVolDefFree > > > ==21309== by 0x55BF1D8: virStoragePoolObjClearVols > > > ==21309== by 0x153D967E: virStorageVolPoolRefreshThread > > > ... > > > ==21309== Block was alloc'd at > > > ==21309== at 0x4C300A5: calloc > > > ==21309== by 0x54CB483: virAlloc > > > ==21309== by 0x55BDC1F: virStorageVolDefParseXML > > > ==21309== by 0x55BDC1F: virStorageVolDefParseNode > > > ==21309== by 0x55BE5A4: virStorageVolDefParse > > > ==21309== by 0x153DDFF1: storageVolCreateXML > > > ==21309== by 0x562035B: virStorageVolCreateXML > > > ==21309== by 0x147366: remoteDispatchStorageVolCreateXML > > > ... > > > > > > Signed-off-by: John Ferlan <jferlan@xxxxxxxxxx> > > > --- > > > src/storage/storage_driver.c | 9 +++++++++ > > > 1 file changed, 9 insertions(+) > > > > > > diff --git a/src/storage/storage_driver.c b/src/storage/storage_driver.c > > > index b0edf9f885..5e920fc14c 100644 > > > --- a/src/storage/storage_driver.c > > > +++ b/src/storage/storage_driver.c > > > @@ -2257,6 +2257,7 @@ virStorageVolPoolRefreshThread(void *opaque) > > > virStorageBackendPtr backend; > > > virObjectEventPtr event = NULL; > > > > > > + retry: > > > storageDriverLock(); > > > if (cbdata->vol_path) { > > > if (virStorageBackendPloopRestoreDesc(cbdata->vol_path) < 0) > > > @@ -2270,6 +2271,14 @@ virStorageVolPoolRefreshThread(void *opaque) > > > if (!(backend = virStorageBackendForType(def->type))) > > > goto cleanup; > > > > > > + /* Some thread is creating a new volume in the pool, we need to retry */ > > > + if (virStoragePoolObjGetAsyncjobs(obj) > 0) { > > > + virStoragePoolObjUnlock(obj); > > > + storageDriverUnlock(); > > > + usleep(100 * 1000); > > > + goto retry; > > > + } > > > + > > > virStoragePoolObjClearVols(obj); > > > if (backend->refreshPool(NULL, obj) < 0) > > > VIR_DEBUG("Failed to refresh storage pool"); > > > > ACK, does the job here. I'm rather surprised to see you implementing it > > with sleep, while you pointed me towards virCondWait yesterday. But using > > sleep is a way less intrusive change. > > > > Well you only asked about an alternative mechanism and condition > variable was the first thing that popped into my mind; however, as I got > to actually doing more thinking about it - asyncjobs is not blocking > multiple creates from occurring; whereas, a condition variable would be > waiting for one thing to complete. > > My response to Jan also lists 2 other alternatives. This was just the > "easiest" of the 3. If there's other ideas, I'm open to suggestions. > It looks fine to me, I was just surprised to see the sleep version ;) -- Cedric -- libvir-list mailing list libvir-list@xxxxxxxxxx https://www.redhat.com/mailman/listinfo/libvir-list