On Tue, Aug 25, 2020 at 4:07 PM Daniel P. Berrangé <berrange@xxxxxxxxxx> wrote:
>
> On Tue, Aug 25, 2020 at 03:16:50PM +0200, Christian Ehrhardt wrote:
> > Hi,
> > I expect that this falls under the "with meson now everything is
> > different anyway" umbrella but wanted to let you know about this as it
> > affects v6.6 in at least Ubuntu/Debian.
> >
> > The following recent patch has broken libvirt-lxc for us:
> > commit d7147b3797380de2d159ce6324536f3e1f2d97e3
> > Author: Pavel Hrdina <phrdina@xxxxxxxxxx>
> > Date: Fri Jun 19 00:44:07 2020 +0200
> > m4: virt-xdr: rewrite XDR check
> >
> > I was tracking that down for [1] since the tests [4] failed on me. [2]
> > holds the backtrace.
> > In Debian the tests are skipped which explains why they were not seen there:
> > smoke-lxc SKIP Test requires machine-level isolation but testbed
> > does not provide that
> >
> > What happens is that the libvirt_lxc segfaults when using XDR functions.
> >
> > dmesg shows:
> > [582093.524644] libvirt_lxc[261446]: segfault at 0 ip 0000000000000000
> > sp 00007ffdd2345598 error 14 in libvirt_lxc[5587e42aa000+8000]
> > [582093.524650] Code: Bad RIP value.
> >
> > There are quite some uncertainties left, but on the surface it seems
> > that it links with libtirpc but
> > then instead of calling
> > libtirpc: src/xdr.c:929:xdr_uint64_t(xdrs, ullp)
> > it ends (gdb tells us in [2]) in glibc
> > glibc: sunrpc/xdr_intXX_t.c:62:xdr_uint64_t (XDR *xdrs, uint64_t *uip)
> >
> > And the return from that function breaks it badly (instruction pointer
> > at 0x0 -> segfault)
>
> Right so that's a serious problem with clashing symbols between tirpc
> and glibc.
>
> In Fedora/RHEL it is impossible to build against glibc for the XDR
> symbols for a long time now. Glibc maintainers want everyone to be
> using tirpc. The symbols are still exported from glibc, but they
> should only be used by legacy apps built against older glibc.
>
> Symbol versioning should ensure libvirt_lxc always resolves to the
> libtirpc library
>
> $ eu-readelf -a /usr/lib64/libc.so.6 | grep xdr_uint64 | grep GLOBAL
> 2017: 00000000001349c0 226 FUNC GLOBAL DEFAULT 15 xdr_uint64_t@GLIBC_2.2.5
>
>
> $ eu-readelf -a /usr/lib64/libtirpc.so | grep xdr_uint64 | grep GLOBAL
> 344: 000000000001ce20 9 FUNC GLOBAL DEFAULT 14 xdr_uint64_t@@TIRPC_0.3.0
>
> $ eu-readelf -a /usr/libexec/libvirt_lxc | grep xdr_uint64
> 0x0000000000024a30 X86_64_JUMP_SLOT 000000000000000000 +0 xdr_uint64_t
> 149: 0000000000000000 0 FUNC GLOBAL DEFAULT UNDEF xdr_uint64_t@TIRPC_0.3.0 (13)

ubuntu@groovy:~$ eu-readelf -a /lib/x86_64-linux-gnu/libc.so.6 | grep xdr_uint64 | grep GLOBAL
2019: 0000000000159ed0 228 FUNC GLOBAL DEFAULT 16 xdr_uint64_t@@GLIBC_2.2.5
ubuntu@groovy:~$ eu-readelf -a /lib/x86_64-linux-gnu/libtirpc.so.3.0.0 | grep xdr_uint64 | grep GLOBAL
343: 000000000001ae20 9 FUNC GLOBAL DEFAULT 15 xdr_uint64_t@@TIRPC_0.3.0

Ubuntu v6.0 builds
ubuntu@groovy:~$ eu-readelf -a /usr/lib/libvirt/libvirt_lxc | grep xdr_uint64
0x0000000000026820 X86_64_JUMP_SLOT 000000000000000000 +0 xdr_uint64_t
99: 0000000000000000 0 FUNC GLOBAL DEFAULT UNDEF xdr_uint64_t@GLIBC_2.2.5 (4)
[ 1c02] xdr_uint64_t

Ubuntu v6.6 builds
ubuntu@groovy:~$ eu-readelf -a /usr/lib/libvirt/libvirt_lxc | grep xdr_uint64
0x00000000000268d0 X86_64_JUMP_SLOT 000000000000000000 +0 xdr_uint64_t
104: 0000000000000000 0 FUNC GLOBAL DEFAULT UNDEF xdr_uint64_t@GLIBC_2.2.5 (4)
[ 1a81] xdr_uint64_t

They miss the version 3.0 entry - interesting.
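
The comparison of version tags above could be scripted roughly as follows.
This is only a minimal sketch: the helper name symbol_versions is
hypothetical, and it assumes the "name@VERSION" / "name@@VERSION" layout
seen in the eu-readelf output quoted in this thread, nothing more.

```python
import re

# Matches a "name@VERSION" or "name@@VERSION" field, as it appears at the
# end of the symbol-table lines quoted in this thread (assumed layout).
SYMVER = re.compile(r"(?P<name>[A-Za-z_]\w*)@{1,2}(?P<ver>[A-Za-z0-9_.]+)")


def symbol_versions(readelf_text, symbol):
    """Return the set of version tags attached to `symbol`.

    `readelf_text` is expected to be the captured output of something like
    `eu-readelf -a <binary> | grep <symbol>`; lines carrying no version tag
    (e.g. the JUMP_SLOT relocation line) are simply ignored.
    """
    versions = set()
    for line in readelf_text.splitlines():
        for match in SYMVER.finditer(line):
            if match.group("name") == symbol:
                versions.add(match.group("ver"))
    return versions


if __name__ == "__main__":
    # Two lines copied from the outputs above: Fedora build vs. Ubuntu v6.6.
    fedora = "149: 0000000000000000 0 FUNC GLOBAL DEFAULT UNDEF xdr_uint64_t@TIRPC_0.3.0 (13)"
    ubuntu = "104: 0000000000000000 0 FUNC GLOBAL DEFAULT UNDEF xdr_uint64_t@GLIBC_2.2.5 (4)"
    for label, text in (("fedora", fedora), ("ubuntu", ubuntu)):
        print(label, sorted(symbol_versions(text, "xdr_uint64_t")))
```

A binary whose set contains a GLIBC_* tag but no TIRPC_* tag for an XDR
symbol would be a candidate for the mis-resolution described here.
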
libvirt 6.6 build from git on the same system:
$ eu-readelf -a libvirt/build/src/.libs/libvirt_lxc | grep xdr_uint64
0x0000000000028968 X86_64_JUMP_SLOT 000000000000000000 +0 xdr_uint64_t
99: 0000000000000000 0 FUNC GLOBAL DEFAULT UNDEF xdr_uint64_t@GLIBC_2.2.5 (3)
598: 0000000000000000 0 FUNC GLOBAL DEFAULT UNDEF xdr_uint64_t@@GLIBC_2.2.5
[ 31df] xdr_uint64_t@@GLIBC_2.2.5
[ 18f4] xdr_uint64_t

That is with configure:
xdr: yes (CFLAGS='-I/usr/include/tirpc' LIBS='-ltirpc')

So something is wrong at build time when glibc AND tirpc provide that symbol.

>
> This shows libvirt_lxc will only resolve to libtirpc.
>
>
> I see the Ubuntu package for glibc is passing --enable-obsolete-rpc which
> allows apps to continue to build against glibc for RPC :-(
>
> So I suspect somehow libvirt has ended up using tirpc headers, but the linker
> probably resolved symbols to glibc.

As I wrote above my builds don't get the 3.0 entry in libvirt_lxc
which seems to be the reason to then jump to the wrong one.

> I don't know how the linker decides which library to resolve symbols to
> when multiple provided the same symbol with different versions. Possibly
> tries in order ? I do recall that there were lots of problems with having
> both glibc and libtirpc used in Fedora before glibc introduced the
> ability to disable RPC via --disable-obsolete-rpc to
>
> Did I mention that --enable-obsolete-rpc is a bad idea yet :-P

You are probably right, but that will be a different bug for a different day.

> FWIW, you're going to be forced to stop using this arg because it has been
> deleted entirely in glibc 2.32, so there's no way to compile against
> glibc for XDR. Only existing built binaries will work.

By then at least it won't be able to link in the wrong one anymore :-)
And 2.32 is planned sometime soon for Ubuntu [1], so maybe I can do
the revert for a week and then drop it on a rebuild.
[1]: https://discourse.ubuntu.com/t/groovy-gorilla-release-schedule/15531

>
>
> Regards,
> Daniel
> --
> |: https://berrange.com -o- https://www.flickr.com/photos/dberrange :|
> |: https://libvirt.org -o- https://fstop138.berrange.com :|
> |: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|
>

--
Christian Ehrhardt
Staff Engineer, Ubuntu Server
Canonical Ltd