Re: ldconfig -> Aborted.


On Thu, Dec 22, 2011 at 4:01 PM, clemens fischer
<ino-news@xxxxxxxxxxxxxxxxxxxxxxxxx> wrote:
>
>> # ldconfig -v
>> ldconfig: Can't stat /usr/lib64: No such file or directory
>> /usr/lib/libfakeroot:
>>        libfakeroot-0.so -> libfakeroot.so
>> /usr/lib/perl5/core_perl/CORE:
>>        libperl.so -> libperl.so
>> /lib:
>> Aborted
>
> I think there's no harm in "mkdir /usr/lib64".
>
> To me this sounds as if the VM balloons out of memory.  How much RAM is
> allocated to the VM's?

(fair amount of debug output ... summary at end)

yeah i originally tried upping the mem to 1024M+, preventing the
balloon module from loading (since it's an opt-in kernel module), and
not even using mem ballooning -- no change at all.  the /usr/lib64
warning isn't the problem; my guess is everyone's machine prints it
(/lib64 is created by glibc for compat reasons only, not by the
filesystem package) ...

... though, after rebuilding glibc with debug syms, i was able to
trace the issue.  `ldconfig` is consistently receiving the correct,
then incorrect(?) inode, twice(!), for an arbitrary library; `ldconfig`
detects this anomaly just before adding the entry to its aux-cache,
then explicitly calls abort().

while the problem library is `libgcrypt.so.11`, it's not specific to
that lib (if i remove that library, it just fails on a different one)
... possibly a pattern here but not yet sure.

i ran `gdb --args ldconfig -v` (breakpoint, output, backtrace, and
source context provided below):

----------------------------------------------------------------------------
Reading symbols from /sbin/ldconfig...done.
(gdb) break cache.c:620 if (soname!=0x0 && strcmp(soname, "libgcrypt.so.11")==0)
Breakpoint 1 at 0x402e14: file cache.c, line 620.
(gdb) commands
Type commands for breakpoint(s) 1, one per line.
End with a line saying just "end".
>silent
>printf "\n---- soname: %s\n---- inode:  %i\n---- hash:   %i\n\n", soname, id->ino, hash
>continue
>end
(gdb) run
Starting program: /sbin/ldconfig -v
/sbin/ldconfig: Can't stat /usr/lib64: No such file or directory

---- soname: libgcrypt.so.11
---- inode:  15344348
---- hash:   722

/usr/lib/libfakeroot:
        libfakeroot-0.so -> libfakeroot.so
/usr/lib/perl5/core_perl/CORE:
        libperl.so -> libperl.so
/lib:

---- soname: libgcrypt.so.11
---- inode:  15344350
---- hash:   834


---- soname: libgcrypt.so.11
---- inode:  15344350
---- hash:   834


Program received signal SIGABRT, Aborted.
0x000000000044f4fc in raise (sig=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:64
64        return INLINE_SYSCALL (tgkill, 3, pid, selftid, sig);
(gdb) bt
#0  0x000000000044f4fc in raise (sig=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:64
#1  0x000000000040c20e in abort () at abort.c:93
#2  0x0000000000402e57 in insert_to_aux_cache (id=0x7fffffffd1b0, flags=771, osversion=0, soname=0x6db360 "libgcrypt.so.11", used=1) at cache.c:625
#3  0x0000000000403dea in add_to_aux_cache (stat_buf=<optimized out>, flags=<optimized out>, osversion=<optimized out>, soname=<optimized out>) at cache.c:650
#4  0x00000000004023cd in search_dir (entry=0x6d09d0) at ldconfig.c:880
#5  0x0000000000402d09 in search_dirs () at ldconfig.c:1023
#6  main (argc=2, argv=<optimized out>) at ldconfig.c:1372
(gdb) list cache.c:620,625
620       for (entry = aux_hash[hash]; entry; entry = entry->next)
621         if (id->ino == entry->id.ino
622             && id->ctime == entry->id.ctime
623             && id->size == entry->id.size
624             && id->dev == entry->id.dev)
625           abort ();
----------------------------------------------------------------------------

... before adding a new entry to the cache, `ldconfig` loops through
the existing entries and aborts if an *exact* match (same inode, ctime,
size, and device) is found ... and in this case there somehow appear to
be 2 entries for the same library (with different inodes): the first is
bogus (from the VM's perspective anyway) and the second is added twice,
triggering the abort.

i don't know if v9fs or QEMU is supposed to be changing the inode, but
every file i test is "off by two", example (host/VM, resp):

# stat --format="%i %n" ./lib/libgcrypt.so.11.7.0
15344348 ./lib/libgcrypt.so.11.7.0

# stat --format="%i %n" /lib/libgcrypt.so.11.7.0
15344350 /lib/libgcrypt.so.11.7.0

... `ldconfig` is attempting to add EACH as `libgcrypt.so.11.7.0` (see
gdb output)!  very suspicious.  the first (host?) version is somehow
detected before anything in ld.so.conf.d/* is tried (and gdb confirms
all other libs are found during this period as well) ...

... something is definitely wonky though, because `ldconfig` tries to
add inode `15344348` as `libgcrypt.so.11.7.0`, but that is totally
wrong from guest perspective:

stat --format="%i %n" /lib/l* | grep 15344348
15344348 /lib/libext2fs.so.2.4
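
the same comparison can be scripted so it is easy to run on both host
and guest.  this is a minimal sketch using only python's standard `os`
module; the helper names `file_identity` and `paths_with_inode` are
made up for illustration, and the paths are passed on the command line.

```python
import os
import sys


def file_identity(path):
    """Return (ino, ctime_ns, size, dev) for path, following symlinks."""
    st = os.stat(path)
    return (st.st_ino, st.st_ctime_ns, st.st_size, st.st_dev)


def paths_with_inode(directory, inode):
    """List entries directly inside directory whose own inode equals inode."""
    matches = []
    for name in os.listdir(directory):
        path = os.path.join(directory, name)
        # lstat reports the entry itself, like plain `stat` without -L
        if os.lstat(path).st_ino == inode:
            matches.append(path)
    return sorted(matches)


if __name__ == "__main__":
    # usage: script.py PATH [PATH ...]  -> prints the identity tuple per path
    for path in sys.argv[1:]:
        print(path, file_identity(path))
```

running it with the same library path on the host and in the guest
shows which of the four fields differ, and `paths_with_inode("/lib",
15344348)` is the scripted equivalent of the `stat ... | grep` above.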

... i don't know how the !@#$ it's getting that, but i suspect some
kind of bad interaction between the host/VM page caches, or a bug in
ldconfig, the v9fs kernel module, the "virtfs" server implemented
within QEMU, or possibly something *very* odd about my setup. i'm also
using the "mapped" virtfs option (guest perms/etc are stored in xattrs
on the host) allowing QEMU to run as nobody:kvm instead of root ...
could be part of the problem ... i thought this was the recommended
way, perhaps not.

in conclusion ... the issue is very unlikely to be Arch-specific. i'll
debug a bit more and take the information to the proper sources, but
i figured i'd post a final update here for closure/interest. i will of
course still gladly accept any further advice or suggestions.

thanks!

-- 

C Anthony

