On Thu, Dec 22, 2011 at 4:01 PM, clemens fischer <ino-news@xxxxxxxxxxxxxxxxxxxxxxxxx> wrote:
>
>> # ldconfig -v
>> ldconfig: Can't stat /usr/lib64: No such file or directory
>> /usr/lib/libfakeroot:
>>         libfakeroot-0.so -> libfakeroot.so
>> /usr/lib/perl5/core_perl/CORE:
>>         libperl.so -> libperl.so
>> /lib:
>> Aborted
>
> I think there's no harm in "mkdir /usr/lib64".
>
> To me this sounds as if the VM balloons out of memory.  How much RAM is
> allocated to the VM's?

(fair amount of debug output ... summary at end)

yeah, i originally tried upping the mem to 1024M+, preventing the balloon
module from loading (since it's an opt-in kernel module), and not even using
mem ballooning -- no changes at all.

the /usr/lib64 stuff isn't a prob; my guess is everyone's machine does that
(/lib64 is created by glibc for compat reasons only, it's not in the
filesystem package).

after rebuilding glibc with debug syms, though, i was able to trace the issue.
`ldconfig` is consistently receiving the correct, then incorrect(?) inode,
twice(!), for an arbitrary library; `ldconfig` detects this anomaly just
before adding the entry to its aux-cache, then explicitly calls abort().

while the problem library is `libgcrypt.so.11`, it's not specific to that lib
(if i remove that library, it just fails on a different one) ... possibly a
pattern here, but not yet sure.

i ran `gdb --args ldconfig -v` (breakpoint, output, backtrace, and source
context provided below):

----------------------------------------------------------------------------
Reading symbols from /sbin/ldconfig...done.
(gdb) break cache.c:620 if (soname!=0x0 && strcmp(soname, "libgcrypt.so.11")==0)
Breakpoint 1 at 0x402e14: file cache.c, line 620.
(gdb) commands
Type commands for breakpoint(s) 1, one per line.
End with a line saying just "end".
>silent
>printf "\n---- soname: %s\n---- inode: %i\n---- hash: %i\n\n", soname, id->ino, hash
>continue
>end
(gdb) run
Starting program: /sbin/ldconfig -v
/sbin/ldconfig: Can't stat /usr/lib64: No such file or directory

---- soname: libgcrypt.so.11
---- inode: 15344348
---- hash: 722

/usr/lib/libfakeroot:
        libfakeroot-0.so -> libfakeroot.so
/usr/lib/perl5/core_perl/CORE:
        libperl.so -> libperl.so
/lib:

---- soname: libgcrypt.so.11
---- inode: 15344350
---- hash: 834


---- soname: libgcrypt.so.11
---- inode: 15344350
---- hash: 834


Program received signal SIGABRT, Aborted.
0x000000000044f4fc in raise (sig=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:64
64        return INLINE_SYSCALL (tgkill, 3, pid, selftid, sig);
(gdb) bt
#0  0x000000000044f4fc in raise (sig=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:64
#1  0x000000000040c20e in abort () at abort.c:93
#2  0x0000000000402e57 in insert_to_aux_cache (id=0x7fffffffd1b0, flags=771, osversion=0, soname=0x6db360 "libgcrypt.so.11", used=1) at cache.c:625
#3  0x0000000000403dea in add_to_aux_cache (stat_buf=<optimized out>, flags=<optimized out>, osversion=<optimized out>, soname=<optimized out>) at cache.c:650
#4  0x00000000004023cd in search_dir (entry=0x6d09d0) at ldconfig.c:880
#5  0x0000000000402d09 in search_dirs () at ldconfig.c:1023
#6  main (argc=2, argv=<optimized out>) at ldconfig.c:1372
(gdb) list cache.c:620,625
620       for (entry = aux_hash[hash]; entry; entry = entry->next)
621         if (id->ino == entry->id.ino
622             && id->ctime == entry->id.ctime
623             && id->size == entry->id.size
624             && id->dev == entry->id.dev)
625           abort ();
----------------------------------------------------------------------------

before adding a new entry to the cache, `ldconfig` loops through the existing
entries in the matching hash bucket and aborts if an *exact* match (same
inode, ctime, size, and device) is found. in this case there somehow appear
to be 2 entries for the same library, with different inodes: the first is
bogus (from the VM's perspective anyway) and the second is added twice,
triggering the abort.
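to make the failure mode concrete, here's a minimal standalone model of the check shown at cache.c:620-625 above. this is a sketch, not glibc's code: the struct names, the table size, the `bucket_of` hash, and the "return -1 instead of abort()" behaviour are all my own assumptions for illustration. the only part taken from the listing is the four-field identity comparison.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define AUX_HASH_SIZE 1021          /* hypothetical table size, not glibc's */

/* The four fields the listing compares: inode, ctime, size, device. */
struct file_id {
    uint64_t ino, ctime, size, dev;
};

struct aux_entry {
    struct file_id id;
    struct aux_entry *next;         /* chain of entries in one bucket */
};

static struct aux_entry *aux_hash[AUX_HASH_SIZE];

/* Hypothetical hash: any function of the four fields works for the model. */
static unsigned int bucket_of(const struct file_id *id)
{
    uint64_t h = id->ino;
    h = h * 31 + id->ctime;
    h = h * 31 + id->size;
    h = h * 31 + id->dev;
    return (unsigned int)(h % AUX_HASH_SIZE);
}

/*
 * Insert a file identity into the table.
 * Returns 0 on success and -1 when an identical identity is already present,
 * which is the condition under which the code in the listing calls abort().
 */
static int aux_insert(const struct file_id *id)
{
    unsigned int b = bucket_of(id);

    for (struct aux_entry *e = aux_hash[b]; e; e = e->next)
        if (id->ino == e->id.ino
            && id->ctime == e->id.ctime
            && id->size == e->id.size
            && id->dev == e->id.dev)
            return -1;              /* exact duplicate */

    struct aux_entry *e = malloc(sizeof *e);
    assert(e != NULL);
    e->id = *id;
    e->next = aux_hash[b];          /* push onto the front of the chain */
    aux_hash[b] = e;
    return 0;
}
```

feeding it the sequence from the gdb run (inode 15344348 once, then 15344350 twice, with made-up but identical ctime/size/dev values) accepts the first two inserts and rejects the third, which matches where the abort happens in the trace.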
i don't know if v9fs or QEMU is supposed to be changing the inode, but every
file i test is "off by two". example (host, then VM):

    # stat --format="%i %n" ./lib/libgcrypt.so.11.7.0
    15344348 ./lib/libgcrypt.so.11.7.0

    # stat --format="%i %n" /lib/libgcrypt.so.11.7.0
    15344350 /lib/libgcrypt.so.11.7.0

`ldconfig` is attempting to add EACH as `libgcrypt.so.11.7.0` (see gdb
output)! very suspicious. the first (host?) version is somehow detected
before anything in ld.so.conf.d/* is tried (and gdb confirms all other libs
are found during this period as well).

something is definitely wonky though, because `ldconfig` tries to add inode
`15344348` as `libgcrypt.so.11.7.0`, and that is totally wrong from the
guest's perspective, where that inode belongs to a different library:

    stat --format="%i %n" /lib/l* | grep 15344348
    15344348 /lib/libext2fs.so.2.4

i don't know how the !@#$ it's getting that, but i suspect some kind of bad
interaction between the host/VM page caches, or a bug in ldconfig, the v9fs
kernel module, the "virtfs" server implemented within QEMU, or possibly
something *very* odd about my setup.

i'm also using the "mapped" virtfs option (guest perms/etc are stored in
xattrs on the host), which allows QEMU to run as nobody:kvm instead of root.
that could be part of the problem; i thought this was the recommended way,
perhaps not.

in conclusion: the issue is very unlikely to be Arch-specific. i'll debug a
bit more and take the information to the proper sources, but i figured i'd
post a final update here for closure/interest. i will of course still gladly
accept any further advice or suggestions.

thanks!

--

C Anthony