On 02/28/2013 11:16 AM, Daniel P. Berrange wrote: > On Thu, Feb 28, 2013 at 11:11:53AM -0500, Laine Stump wrote: >> On 02/28/2013 08:37 AM, Daniel P. Berrange wrote: >>> From: "Daniel P. Berrange" <berrange@xxxxxxxxxx> >>> >>> The nl_recvmsg does not always set errno. Instead it returns >>> its own custom set of error codes. Thus we were reporting the >>> wrong data. >>> --- >>> src/util/virnetlink.c | 5 +++-- >>> 1 file changed, 3 insertions(+), 2 deletions(-) >>> >>> diff --git a/src/util/virnetlink.c b/src/util/virnetlink.c >>> index 0b36fdc..8b47ede 100644 >>> --- a/src/util/virnetlink.c >>> +++ b/src/util/virnetlink.c >>> @@ -335,8 +335,9 @@ virNetlinkEventCallback(int watch, >>> if (length == 0) >>> return; >>> if (length < 0) { >>> - virReportSystemError(errno, >>> - "%s", _("nl_recv returned with error")); >>> + virReportError(VIR_ERR_INTERNAL_ERROR, >>> + _("nl_recv returned with error: %s"), >>> + nl_geterror(length)); >> My recollection is that we specifically avoided calling nl_geterror() >> because it isn't threadsafe. >> >> I'll go take another look to verify. > I did check this, but only for libnl3 which merely does a static > string table lookup: > > const char *nl_geterror(int error) > { > error = abs(error); > > if (error > NLE_MAX) > error = NLE_FAILURE; > > return errmsg[error]; > } nl_geterror() in libnl-1.1 calls strerror() which isn't threadsafe: char *nl_geterror(void) { if (errbuf) return errbuf; if (nlerrno) return strerror(nlerrno); return "Sucess\n"; } Of course strerror is only called from here if errbuf hasn't been set, and I *think* that is rare, and we added a patch to all Fedora/RHEL builds of libnl-1.1 that put errbuf and nlerrno into thread-local storage quite a long time ago (August 2010: https://bugzilla.redhat.com/show_bug.cgi?id=617291 ). *But* the function that sets errbuf, __nl_error(), itself calls strerror(). And when I look at strerror() in glibc, I'm not exactly getting warm fuzzies about what could happen if it was called simultaneously from two threads. It mallocs a global char* (after freeing the original) then uses strerror_r to duplicate the standard system error string into the malloc'ed buffer. So it's possible for one thread to be dereferencing a pointer to a buffer that has already been free'd by another thread. Of course that's all academic, since __nl_error() is called when an error occurs anyway, regardless of whether you subsequently decide to call nl_geterror() or not. So the conclusion is that I see no extra harm in calling nl_geterror(). ACK. -- libvir-list mailing list libvir-list@xxxxxxxxxx https://www.redhat.com/mailman/listinfo/libvir-list