On 11/16/2010 08:13 AM, Daniel P. Berrange wrote:
On Tue, Nov 16, 2010 at 08:09:35AM -0500, Stefan Berger wrote:
On 11/10/2010 07:43 AM, Stefan Berger wrote:
On 11/10/2010 05:41 AM, Daniel P. Berrange wrote:
On Tue, Nov 09, 2010 at 07:43:23PM -0500, Stefan Berger wrote:
I am replacing the last instances of close() I found with VIR_CLOSE() /
VIR_FORCE_CLOSE respectively.
The first patch of virsh I missed out on previously.
The 2nd patch I had left out intentionally to look at it more
carefully:
The 'closed' variable could be easily removed since it wasn't used
anywhere else. The possible race condition that could result from the
filedescriptor being closed and not set to -1 (and possibly let us
write
into 'something' totally different if the fd was allocated by another
thread) seems to be prevented by the qemuMonitorLock() already placed
around the code that reads from or writes to the fd. So the change of
this code as shown in the patch should not have any side-effects.
Signed-off-by: Stefan Berger<stefanb@xxxxxxxxxx>
--- libvirt-acl.orig/src/qemu/qemu_monitor.c
+++ libvirt-acl/src/qemu/qemu_monitor.c
@@ -73,8 +73,6 @@ struct _qemuMonitor {
/* If the monitor EOF callback is currently active (stops more
commands being run) */
unsigned eofcb: 1;
- /* If the monitor is in process of shutting down */
- unsigned closed: 1;
unsigned json: 1;
};
@@ -692,17 +690,11 @@ void qemuMonitorClose(qemuMonitorPtr mon
VIR_DEBUG("mon=%p", mon);
qemuMonitorLock(mon);
- if (!mon->closed) {
+
+ if (mon->fd>= 0) {
if (mon->watch)
virEventRemoveHandle(mon->watch);
- if (mon->fd != -1)
- close(mon->fd);
- /* NB: ordinarily one might immediately set mon->watch to -1
- * and mon->fd to -1, but there may be a callback active
- * that is still relying on these fields being valid. So
- * we merely close them, but not clear their values and
- * use this explicit 'closed' flag to track this state */
- mon->closed = 1;
+ VIR_FORCE_CLOSE(mon->fd);
}
Err, the comment you deleted here explains why this change is not
safe.
If I looked at the code correctly then this here is the callback:
static void
qemuMonitorIO(int watch, int fd, int events, void *opaque) {
qemuMonitorPtr mon = opaque;
int quit = 0, failed = 0;
qemuMonitorLock(mon);
qemuMonitorRef(mon);
#if DEBUG_IO
VIR_DEBUG("Monitor %p I/O on watch %d fd %d events %d", mon, watch, f
#endif
if (mon->fd != fd || mon->watch != watch) {
VIR_ERROR(_("event from unexpected fd %d!=%d / watch %d!=%d"), mo
failed = 1;
} else {
[...]
It grabs the lock and then subsequently, with the lock held, calls
into qemuMonitorIOWrite and qemuMonitorIORead (not shown, this is done
in the [...]), which then access the mon->fd. Now the above
VIR_FORCE_CLOSE() function is also being called with the lock held,
thus serializes the access to the file descriptor (mon->fd). When the
'if (mon->fd != fd)' is evaluate (while the lock is held) after the fd
was closed and now set to -1, then we won't read from a non-open file
descriptor anymore, which would otherwise occur in the else branch.
Otherwise, if fd->mon != -1, it would do the reads and writes as
intended and the above close could not happen while it is doing that.
When I made the above change I first introduced another lock
'mon->monFDLock' to serialize the access to the fd, but then I saw
that the qemuMonitorLock() was already doing that as well, so I ended
up removing it again entirely. Well, to me this was convincing so that
the above changes seemed 'safe'. Actually, I think the previous code
could write and read to a file descriptor that doesn't communicate
with the monitor anymore but may belong to another thread now (since
the mon->fd wasn't set to -1, but kept as-is). So, to me this is
actually an improvement.
Did I convince you?
Yes& no. On the surface it looks ok, but I'm still a little worried about
the implications, because I put this code in there to avoid some nasty
re-entrancy problems. I guessing some later re-factoring might have fixed
those problems making this code redundant. So I'll ACK for nwo, but if we
notice any unusual behaviour with the monitor, then this is the place to
look...
The re-entrancy that was previously avoided with the 'closed' flag is
now checked with the file descriptor being >= 0, both with the lock
being held, so no concurrency is possible.
The most dangerous thing IMO is that someone changes the locking later
on for whatever reason. There really should be a comment on the
functions reading and writing from/to the mon->fd that they need to be
called with that lock held. Also that the 'mon-fd' is protected by the
existing lock may be worth a comment.
I'd push unless you want me to add comments to the file descriptor and
functions requiring the lock being held.
Stefan
--
libvir-list mailing list
libvir-list@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/libvir-list