On 03/31/2013 10:20 AM, Peter Krempa wrote: > The last Viktor's effort to fix the race and memory corruption unfortunately > wasn't complete in the case the close callback was not registered in an > connection. At that time, the trail of event's that I'll describe later could > still happend and corrupt the memory or cause a crash of the client (including s/happend/happen/ > the daemon in case of a p2p migration). > > Consider the following prerequisities and trail of events: > Let's have a remote connection to a hypervisor that doesn't have a close > callback registered and the client is using the event loop. The crash happens in > cooperation of 2 threads. Thread E is the event loop and thread W is the worker > that does some stuff. R denotes the remote client. > > 1.) W - The client finishes everything and sheds the last reference on the client > 2.) W - The virObject stuff invokes virConnectDispose that invokes doRemoteClose > 3.) W - the remote close method invokes the REMOTE_PROC_CLOSE RPC method. > 4.) W - The thread is preempted at this point. > 5.) R - The remote side recieves the close and closes the socket. s/recieves/receives/ > 6.) E - poll() wakes up due to the closed socket and invokes the close callback > 7.) E - The event loop is preempted right before remoteClientCloseFunc is called > 8.) W - The worker now finishes, and frees the conn object. > 9.) E - The remoteClientCloseFunc accesses the now-freed conn object in the > attempt to retrieve pointer for the real close callback. > 10.) Kaboom, corrupted memory/segfault. > > This patch tries to fix this by introducing a new object that survives the > freeing of the connection object. We can't increase the reference count on the > connection object itself as the connection would never be closed as the s/itself as/itself or/; s/closed as/closed, as/ > connection is closed only when the reference count reaches zero. > > The new object - virConnectCloseCallbackData - is a lockable object that keeps > the pointers to the real user registered callback and ensures that the > connection callback is either not called if the connection was already freed or > that the connection isn't freed while this is being called. > --- > src/datatypes.c | 55 ++++++++++++++++++++++++++++++++++++-------- > src/datatypes.h | 22 ++++++++++++++---- > src/libvirt.c | 29 ++++++++++++----------- > src/remote/remote_driver.c | 57 +++++++++++++++++++++++++++------------------- > 4 files changed, 112 insertions(+), 51 deletions(-) So far, this is just a code review. I'm also planning on testing the patch, and will report back with results later in the day. > @@ -63,14 +65,19 @@ static void virStoragePoolDispose(void *obj); > static int > virDataTypesOnceInit(void) > { > -#define DECLARE_CLASS(basename) \ > - if (!(basename ## Class = virClassNew(virClassForObject(), \ > +#define DECLARE_CLASS_COMMON(basename, parent) \ > + if (!(basename ## Class = virClassNew(parent, \ > #basename, \ > sizeof(basename), \ > basename ## Dispose))) \ > return -1; > +#define DECLARE_CLASS(basename) \ > + DECLARE_CLASS_COMMON(basename, virClassForObject()) > +#define DECLARE_CLASS_LOCKABLE(basename) \ > + DECLARE_CLASS_COMMON(basename, virClassForObjectLockable()) Wonder if these definitions are useful enough to rename to VIR_DECLARE_* and move into virobject.h. But that would be a separate patch, to adjust all clients that would benefit from it. > +++ b/src/datatypes.h > @@ -93,6 +93,22 @@ extern virClassPtr virStoragePoolClass; > # define VIR_IS_DOMAIN_SNAPSHOT(obj) \ > (VIR_IS_SNAPSHOT(obj) && VIR_IS_DOMAIN((obj)->domain)) > > + > +typedef struct _virConnectCloseCallbackData virConnectCloseCallbackData; > +typedef virConnectCloseCallbackData *virConnectCloseCallbackDataPtr; > + > +/** > + * Internal structure holding data related to connection close callbacks. > + */ > +struct _virConnectCloseCallbackData { > + virObjectLockable parent; > + > + virConnectPtr conn; > + virConnectCloseFunc callback; > + void *opaque; > + virFreeCallback freeCallback; > +}; Would it make sense to move this struct definition to be local to datatypes.c, and have all other uses of it treat it as opaque?... > +++ b/src/libvirt.c > @@ -20189,24 +20189,27 @@ int virConnectRegisterCloseCallback(virConnectPtr conn, > virObjectRef(conn); > > virMutexLock(&conn->lock); > + virObjectLock(conn->closeCallback); > > virCheckNonNullArgGoto(cb, error); > > - if (conn->closeCallback) { > + if (conn->closeCallback->callback) { ...But then you would need to expose an accessor function instead of directly accessing closeCallback->callback. Okay, probably fine as is. > +++ b/src/remote/remote_driver.c > @@ -337,32 +337,38 @@ enum virDrvOpenRemoteFlags { > VIR_DRV_OPEN_REMOTE_AUTOSTART = (1 << 2), /* Autostart a per-user daemon */ > }; > > +static void > +remoteClientCloseFreeFunc(void *opaque) > +{ > + virConnectCloseCallbackDataPtr cbdata = opaque; > + > + virObjectUnref(cbdata); > +} You shouldn't need this; just use virObjectFreeCallback instead. > +remoteClientCloseFunc(virNetClientPtr client ATTRIBUTE_UNUSED, > + int reason, > + void *opaque) > { > - virConnectPtr conn = opaque; > + virConnectCloseCallbackDataPtr cbdata = opaque; > > - virMutexLock(&conn->lock); > - if (conn->closeCallback) { > - virConnectCloseFunc closeCallback = conn->closeCallback; > - void *closeOpaque = conn->closeOpaque; > - virFreeCallback closeFreeCallback = conn->closeFreeCallback; > - unsigned closeUnregisterCount = conn->closeUnregisterCount; > + virObjectLock(cbdata); > > - VIR_DEBUG("Triggering connection close callback %p reason=%d", > - conn->closeCallback, reason); > - conn->closeDispatch = true; > - virMutexUnlock(&conn->lock); > - closeCallback(conn, reason, closeOpaque); > - virMutexLock(&conn->lock); > - conn->closeDispatch = false; > - if (conn->closeUnregisterCount != closeUnregisterCount && > - closeFreeCallback) > - closeFreeCallback(closeOpaque); > + if (cbdata->callback) { > + VIR_DEBUG("Triggering connection close callback %p reason=%d, opaque=%p", > + cbdata->callback, reason, cbdata->opaque); > + cbdata->callback(cbdata->conn, reason, cbdata->opaque); > + > + if (cbdata->freeCallback) > + cbdata->freeCallback(cbdata->opaque); > + cbdata->callback = NULL; > + cbdata->freeCallback = NULL; > } > - virMutexUnlock(&conn->lock); > + virObjectUnlock(cbdata); > + > + /* free the connection reference that comes along with the callback > + * registration */ > + virObjectUnref(cbdata->conn); > } Took me a couple reads, but it looks right. The old code had to temporarily drop connection lock while using the callback; the new code never even grabs connection lock because all the callback data is encapsulated in a separate object. > > /* helper macro to ease extraction of arguments from the URI */ > @@ -765,9 +771,11 @@ doRemoteOpen(virConnectPtr conn, > goto failed; > } > > + virObjectRef(conn->closeCallback); > + > virNetClientSetCloseCallback(priv->client, > remoteClientCloseFunc, > - conn, NULL); > + conn->closeCallback, remoteClientCloseFreeFunc); Here's where virObjectFreeCallback comes in handy. On the surface, the code seems reasonable. If my testing passes, and you fix up the typos and the unneeded helper function, then I don't mind giving: ACK. -- Eric Blake eblake redhat com +1-919-301-3266 Libvirt virtualization library http://libvirt.org
Attachment:
signature.asc
Description: OpenPGP digital signature
-- libvir-list mailing list libvir-list@xxxxxxxxxx https://www.redhat.com/mailman/listinfo/libvir-list