Re: [go PATCH 00/37] Fix error reporting thread safety wrt goroutine rescheduling

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



This series has been tested by the KubeVirt devs who confirmed it fixes
the problems they see, so I'll push it shortly.

On Mon, Jul 16, 2018 at 02:23:46PM +0100, Daniel P. Berrangé wrote:
> The Libvirt C API provides the virGetLastError() function to let callers
> aquire the error information after a function fails. This function uses
> a thread local, and so must be called in the same OS thread as the
> function that  failed originally.
> eg you could call
> 
>     char *xml = virDomainGetXMLDesc(dom);
>     if (!xml) {
>         virErrorPtr err = virGetLastError();
>         ....do stuff with err...
>     }
> 
> This is all fine, but there is a subtle problem that was overlooked when
> the Go bindings were first created. Specifically a native C API call is
> a goroutine re-scheduling point. So when the Go code does
> 
>     var xml = C.virDomainGetXMLDesc(dom);
>     if (xml == nil) {
>         C.virErrorPtr err = C.virGetLastError();
>         ....do stuff with err...
>     }
> 
> All that code runs in the same goroutine, but at the call entry to
> C.virGetLastError, the goroutine might get switched to a different
> OS level thread. As a result virGetLastError() will return either no
> error at all, or an error from a completely different libvirt API call.
> 
> We need to prevent the OS level thread being changed in between the call
> to the real function and the virGetLastError() function.
> 
> Naively you might think we could put a LockOSThread() / UnlockOSThread()
> call around this block of Go code, but that is a very bad idea in
> reality. Until Go 1.10, the LockOSThread() calls did not ref count, so
> if some other code has already locked the thread, when libvirt called
> UnlockOSThread it could do bad things. In addition, after calling
> UnlockOSThread() the Go runtime doesn't trust the OS thread state
> anymore, so will terminate the thread and spawn a new one. IOW using
> LockOSThread() would mean every single libvirt API call would create and
> destroy a new thread which is horrible for performance.
> 
> Thus this patch series takes a different approach. We create a wrapper
> function for every C API exposed by libvirt, that has a 'virErrorPtr'
> parameter. So the Go code can do
> 
>      var C.virErrorPtr err
>      var xml = C.virDomainGetXMLDescWrapper(dom, &err)
>      if (xml == nil) {
>          ...do stuff with err...
>      }
> 
> The wrapper function is responsible for calling virGetLastError() and
> since this is C code, we're guaranteed its all in the same OS level
> thread.
> 
> Daniel P. Berrangé (37):
>   error: add helper for converting libvirt to go error objects
>   storage volume: add missin blank line
>   Rename *cfuncs.{go,h} to *wrapper.{go,h}
>   Use "Wrapper" or "Helper" as suffix for C functions
>   Change "Compat" suffix to "Wrapper" to have standard naming scheme
>   connect: move wrapper functions out of compat header
>   network: move wrapper functions out of compat header
>   nwfilter binding: move wrapper functions out of compat header
>   node device: move wrapper functions out of compat header
>   secret: move wrapper functions out of compat header
>   stream: move wrapper functions out of compat header
>   storage volume: move wrapper functions out of compat header
>   storage pool: move wrapper functions out of compat header
>   qemu: move wrapper functions out of compat header
>   lxc: move wrapper functions out of compat header
>   domain: move wrapper functions out of compat header
>   make the XXX_wrapper.h header files self-contained
>   Add XXX_wrapper.{h,go} for every remaining file
>   Standardize formatting in all wrapper headers
>   storage vol: fix error reporting thread safety
>   storage pool: fix error reporting thread safety
>   stream: fix error reporting thread safety
>   secret: fix error reporting thread safety
>   nwfilter: fix error reporting thread safety
>   nwfilter binding: fix error reporting thread safety
>   node device: fix error reporting thread safety
>   network: fix error reporting thread safety
>   interface: fix error reporting thread safety
>   domain snapshot: fix error reporting thread safety
>   connect: fix error reporting thread safety
>   domain: fix error reporting thread safety
>   qemu: fix error reporting thread safety
>   lxc: fix error reporting thread safety
>   events: fix error reporting thread safety
>   error: remove GetLastError() function
>   error: make GetNotImplementedError private
>   connect: add missing references on domain object in stats records
> 
>  api_test.go                                   |    5 +-
>  callbacks.go                                  |    4 +-
>  callbacks_cfuncs.go => callbacks_wrapper.go   |    6 +-
>  callbacks_cfuncs.h => callbacks_wrapper.h     |    8 +-
>  connect.go                                    |  741 +++---
>  connect_cfuncs.h                              |   34 -
>  connect_compat.go                             |  206 --
>  connect_compat.h                              |   81 -
>  connect_wrapper.go                            | 1766 +++++++++++++
>  connect_wrapper.h                             |  730 ++++++
>  domain.go                                     | 1083 ++++----
>  domain_compat.go                              |  384 ---
>  domain_compat.h                               |  141 -
>  domain_events.go                              |  235 +-
>  domain_events_cfuncs.h                        |  124 -
>  ...ents_cfuncs.go => domain_events_wrapper.go |   85 +-
>  domain_events_wrapper.h                       |  207 ++
>  domain_snapshot.go                            |   65 +-
>  domain_snapshot_wrapper.go                    |  215 ++
>  domain_snapshot_wrapper.h                     |  102 +
>  domain_wrapper.go                             | 2330 +++++++++++++++++
>  domain_wrapper.h                              |  986 +++++++
>  error.go                                      |   17 +-
>  error_test.go                                 |   44 -
>  events.go                                     |   45 +-
>  events_cfuncs.h                               |   39 -
>  events_cfuncs.go => events_wrapper.go         |   89 +-
>  events_wrapper.h                              |   82 +
>  interface.go                                  |   48 +-
>  interface_wrapper.go                          |  158 ++
>  interface_wrapper.h                           |   76 +
>  lxc.go                                        |   27 +-
>  lxc_compat.go                                 |   51 -
>  lxc_wrapper.go                                |  101 +
>  lxc_compat.h => lxc_wrapper.h                 |   33 +-
>  network.go                                    |   88 +-
>  network_compat.h                              |   10 -
>  network_events.go                             |   23 +-
>  network_events_cfuncs.h                       |   38 -
>  ...nts_cfuncs.go => network_events_wrapper.go |   41 +-
>  network_compat.go => network_events_wrapper.h |   51 +-
>  network_wrapper.go                            |  267 ++
>  network_wrapper.h                             |  119 +
>  node_device.go                                |   66 +-
>  node_device_compat.go                         |   46 -
>  node_device_compat.h                          |    3 -
>  node_device_events.go                         |   32 +-
>  node_device_events_cfuncs.h                   |   40 -
>  ...cfuncs.go => node_device_events_wrapper.go |   46 +-
>  ..._cfuncs.go => node_device_events_wrapper.h |   73 +-
>  node_device_wrapper.go                        |  184 ++
>  node_device_wrapper.h                         |   88 +
>  nwfilter.go                                   |   38 +-
>  nwfilter_binding.go                           |   46 +-
>  nwfilter_binding_compat.h                     |   11 -
>  ...g_compat.go => nwfilter_binding_wrapper.go |   66 +-
>  nwfilter_binding_wrapper.h                    |   60 +
>  nwfilter_wrapper.go                           |  122 +
>  nwfilter_wrapper.h                            |   65 +
>  qemu.go                                       |   43 +-
>  qemu_cfuncs.go                                |   64 -
>  qemu_cfuncs.h                                 |   39 -
>  qemu_compat.go                                |   48 -
>  qemu_compat.h                                 |    3 -
>  qemu_wrapper.go                               |  133 +
>  qemu_wrapper.h                                |   78 +
>  secret.go                                     |   54 +-
>  secret_compat.h                               |    3 -
>  secret_events.go                              |   34 +-
>  secret_events_cfuncs.h                        |   40 -
>  ...ents_cfuncs.go => secret_events_wrapper.go |   45 +-
>  secret_compat.go => secret_events_wrapper.h   |   41 +-
>  secret_wrapper.go                             |  175 ++
>  secret_wrapper.h                              |   86 +
>  storage_pool.go                               |  121 +-
>  storage_pool_compat.h                         |    4 -
>  storage_pool_events.go                        |   34 +-
>  storage_pool_events_cfuncs.h                  |   40 -
>  ...funcs.go => storage_pool_events_wrapper.go |   46 +-
>  ...compat.go => storage_pool_events_wrapper.h |   43 +-
>  storage_pool_wrapper.go                       |  343 +++
>  storage_pool_wrapper.h                        |  150 ++
>  storage_volume.go                             |   86 +-
>  storage_volume_compat.go                      |   47 -
>  storage_volume_compat.h                       |    4 -
>  storage_volume_wrapper.go                     |  249 ++
>  storage_volume_wrapper.h                      |  116 +
>  stream.go                                     |   95 +-
>  stream_cfuncs.go                              |  132 -
>  stream_cfuncs.h                               |   37 -
>  stream_compat.go                              |   69 -
>  stream_compat.h                               |   13 -
>  stream_wrapper.go                             |  331 +++
>  stream_wrapper.h                              |  120 +
>  94 files changed, 11619 insertions(+), 3318 deletions(-)
>  rename callbacks_cfuncs.go => callbacks_wrapper.go (90%)
>  rename callbacks_cfuncs.h => callbacks_wrapper.h (87%)
>  delete mode 100644 connect_cfuncs.h
>  delete mode 100644 connect_compat.go
>  create mode 100644 connect_wrapper.go
>  create mode 100644 connect_wrapper.h
>  delete mode 100644 domain_compat.go
>  delete mode 100644 domain_events_cfuncs.h
>  rename domain_events_cfuncs.go => domain_events_wrapper.go (75%)
>  create mode 100644 domain_events_wrapper.h
>  create mode 100644 domain_snapshot_wrapper.go
>  create mode 100644 domain_snapshot_wrapper.h
>  create mode 100644 domain_wrapper.go
>  create mode 100644 domain_wrapper.h
>  delete mode 100644 error_test.go
>  delete mode 100644 events_cfuncs.h
>  rename events_cfuncs.go => events_wrapper.go (72%)
>  create mode 100644 events_wrapper.h
>  create mode 100644 interface_wrapper.go
>  create mode 100644 interface_wrapper.h
>  delete mode 100644 lxc_compat.go
>  create mode 100644 lxc_wrapper.go
>  rename lxc_compat.h => lxc_wrapper.h (53%)
>  delete mode 100644 network_events_cfuncs.h
>  rename network_events_cfuncs.go => network_events_wrapper.go (59%)
>  rename network_compat.go => network_events_wrapper.h (57%)
>  create mode 100644 network_wrapper.go
>  create mode 100644 network_wrapper.h
>  delete mode 100644 node_device_compat.go
>  delete mode 100644 node_device_events_cfuncs.h
>  rename node_device_events_cfuncs.go => node_device_events_wrapper.go (59%)
>  rename connect_cfuncs.go => node_device_events_wrapper.h (52%)
>  create mode 100644 node_device_wrapper.go
>  create mode 100644 node_device_wrapper.h
>  rename nwfilter_binding_compat.go => nwfilter_binding_wrapper.go (54%)
>  create mode 100644 nwfilter_binding_wrapper.h
>  create mode 100644 nwfilter_wrapper.go
>  create mode 100644 nwfilter_wrapper.h
>  delete mode 100644 qemu_cfuncs.go
>  delete mode 100644 qemu_cfuncs.h
>  delete mode 100644 qemu_compat.go
>  create mode 100644 qemu_wrapper.go
>  create mode 100644 qemu_wrapper.h
>  delete mode 100644 secret_events_cfuncs.h
>  rename secret_events_cfuncs.go => secret_events_wrapper.go (61%)
>  rename secret_compat.go => secret_events_wrapper.h (54%)
>  create mode 100644 secret_wrapper.go
>  create mode 100644 secret_wrapper.h
>  delete mode 100644 storage_pool_events_cfuncs.h
>  rename storage_pool_events_cfuncs.go => storage_pool_events_wrapper.go (60%)
>  rename storage_pool_compat.go => storage_pool_events_wrapper.h (51%)
>  create mode 100644 storage_pool_wrapper.go
>  create mode 100644 storage_pool_wrapper.h
>  delete mode 100644 storage_volume_compat.go
>  create mode 100644 storage_volume_wrapper.go
>  create mode 100644 storage_volume_wrapper.h
>  delete mode 100644 stream_cfuncs.go
>  delete mode 100644 stream_cfuncs.h
>  delete mode 100644 stream_compat.go
>  create mode 100644 stream_wrapper.go
>  create mode 100644 stream_wrapper.h
> 
> -- 
> 2.17.1
> 

Regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|

--
libvir-list mailing list
libvir-list@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/libvir-list




[Index of Archives]     [Virt Tools]     [Libvirt Users]     [Lib OS Info]     [Fedora Users]     [Fedora Desktop]     [Fedora SELinux]     [Big List of Linux Books]     [Yosemite News]     [KDE Users]     [Fedora Tools]

  Powered by Linux