Re: [PATCH] Fix error report from nl_recvmsg

"Daniel P. Berrange" <berrange@xxxxxxxxxx> · Fri, 1 Mar 2013 10:18:23 +0000



On Thu, Feb 28, 2013 at 11:59:42AM -0500, Laine Stump wrote:
> On 02/28/2013 11:16 AM, Daniel P. Berrange wrote:
> > On Thu, Feb 28, 2013 at 11:11:53AM -0500, Laine Stump wrote:
> >> On 02/28/2013 08:37 AM, Daniel P. Berrange wrote:
> >>> From: "Daniel P. Berrange" <berrange@xxxxxxxxxx>
> >>>
> >>> The nl_recvmsg does not always set errno. Instead it returns
> >>> its own custom set of error codes. Thus we were reporting the
> >>> wrong data.
> >>> ---
> >>>  src/util/virnetlink.c | 5 +++--
> >>>  1 file changed, 3 insertions(+), 2 deletions(-)
> >>>
> >>> diff --git a/src/util/virnetlink.c b/src/util/virnetlink.c
> >>> index 0b36fdc..8b47ede 100644
> >>> --- a/src/util/virnetlink.c
> >>> +++ b/src/util/virnetlink.c
> >>> @@ -335,8 +335,9 @@ virNetlinkEventCallback(int watch,
> >>>      if (length == 0)
> >>>          return;
> >>>      if (length < 0) {
> >>> -        virReportSystemError(errno,
> >>> -                             "%s", _("nl_recv returned with error"));
> >>> +        virReportError(VIR_ERR_INTERNAL_ERROR,
> >>> +                       _("nl_recv returned with error: %s"),
> >>> +                       nl_geterror(length));
> >> My recollection is that we specifically avoided calling nl_geterror()
> >> because it isn't threadsafe.
> >>
> >> I'll go take another look to verify.
> > I did check this, but only for libnl3 which merely does a static
> > string table lookup:
> >
> > const char *nl_geterror(int error)
> > {
> >         error = abs(error);
> >
> >         if (error > NLE_MAX)
> >                 error = NLE_FAILURE;
> >
> >         return errmsg[error];
> > }
> 
> nl_geterror() in libnl-1.1 calls strerror() which isn't threadsafe:
> 
> 
>   char *nl_geterror(void)
>   {
>       if (errbuf)
>           return errbuf;
> 
>       if (nlerrno)
>           return strerror(nlerrno);
> 
>       return "Sucess\n";
>   }
> 
> Of course strerror is only called from here if errbuf hasn't been set,
> and I *think* that is rare, and we added a patch to all Fedora/RHEL
> builds of libnl-1.1 that put errbuf and nlerrno into thread-local
> storage quite a long time ago (August 2010:
> https://bugzilla.redhat.com/show_bug.cgi?id=617291 ).
> 
> *But* the function that sets errbuf, __nl_error(), itself calls
> strerror(). And when I look at strerror() in glibc, I'm not exactly
> getting warm fuzzies about what could happen if it was called
> simultaneously from two threads. It mallocs a global char* (after
> freeing the original) then uses strerror_r to duplicate the standard
> system error string into the malloc'ed buffer. So it's possible for one
> thread to be dereferencing a pointer to a buffer that has already been
> free'd by another thread.
> 
> Of course that's all academic, since __nl_error() is called when an
> error occurs anyway, regardless of whether you subsequently decide to
> call nl_geterror() or not.
> 
> So the conclusion is that I see no extra harm in calling nl_geterror(). ACK.

Except the API signature is different, so my patch won't work with
both versions :-(

Daniel
-- 
|: http://berrange.com      -o-    http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org              -o-             http://virt-manager.org :|
|: http://autobuild.org       -o-         http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org       -o-       http://live.gnome.org/gtk-vnc :|

--
libvir-list mailing list
libvir-list@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/libvir-list