On Fri, 2013-01-04 at 11:07 -0500, Mark Salter wrote:
> On Thu, 2013-01-03 at 19:05 -0700, Al Stone wrote:
> > On 01/03/2013 06:09 PM, Mark Salter wrote:
> > > On Thu, 2013-01-03 at 14:02 -0700, Al Stone wrote:
> > >> The redhat-rpm-config and rpm packages build, but Mark Salter and
> > >> Jon Masters will need to put their heads together to figure out
> > >> what needs changing so that they'll work properly for aarch64.
> > >
> > > I'm trying the following for rpm (along with updated config.guess/sub).
> > > It builds but install step hangs in installplatform when it calls the
> > > build dir rpm. I'm trying to sort that out now.
> > >
> > Cool. Yeah, I ran into that, too. Everything seems to have
> > built properly, and then it just hangs. If there's something
> > I can help with to debug this, just holler.
>
> RPM is making a call to NSS_NoDB_Init() (libnss3.so) which never
> returns. Using LD_DEBUG=all it looks like it may actually get stuck
> in libfreebl3.so init, but I'm not sure. GDB would help, but it
> segfaults in my branch. I may take a step back and rebase on the
> master branch now that it has all of the package builds I did on
> my branch.

Thought I'd post some status on this.

I made a simple program with just the NSS_NoDB_Init() call in it and it
hangs as well. I *really* needed gdb to help debug this but gdb would
segfault immediately while starting up. So I got sidetracked looking at
that problem.

The immediate gdb segfault turned out to be another of the mysterious
make/shell problems seen while building earlier packages. These problems
were usually in non-trivial make recipes used during install. In the gdb
case, there is a make rule used to generate an init.c file which has a
single initialization function which calls out to init functions found
in various other source files. The init.c generated during the gdb build
only had a couple of default (always present) init calls in it.
This left most of gdb uninitialized, and the segfault was caused by an
uninitialized pointer dereference. I ran the script to generate init.c
outside of make and it created a reasonable-looking file. I recompiled it
and relinked gdb. This got me further, but it hit an internal error while
still starting up:

  "_initialize_gdb_osabi: gdb_osabi_names[] is inconsistent"

This turned out to be a problem in one of the aarch64 patches, which added
a "Newlib" entry to gdb_osabi_names[] but didn't update the enum of array
indexes. GDB noticed the inconsistency and issued the internal error
because of it. Removing the Newlib entry allowed gdb to at least finish
initializing, read my NSS_NoDB_Init program and set a breakpoint at main.
But when I tried to run it:

  (gdb) b main
  Breakpoint 1 at 0x400858: file nss_nodb_init.c, line 6.
  (gdb) run
  Starting program: /stage2/nss_nodb_init
  Failed to read a valid object file image from memory.
  [Thread debugging using libthread_db enabled]
  Using host libthread_db library "/lib/libthread_db.so.1".
  Cannot find user-level thread for LWP 983: capability not available

That last error looks like a problem in libthread_db (glibc) or maybe the
kernel. I didn't have the glibc sources handy, so I abandoned gdb for
printfs in the nss library.

To make a long story short (and it would have been even shorter with a
working gdb), the NSS_NoDB_Init() hang is in RNG_FileUpdate() in
libfreebl3. This is called repeatedly for a number of files listed in an
internal array, and data from those files is used to update the random
number generator. The list of files is:

  static const char * const files[] = {
      "/etc/passwd",
      "/etc/utmp",
      "/tmp",
      "/var/tmp",
      "/usr/tmp",
      0
  };

RNG_FileUpdate() uses fread() in a loop until a certain amount of data is
read or EOF is reached. With the code instrumented with fprintfs, I see
data from /etc/passwd being read, /etc/utmp being skipped because it
doesn't exist, and then fread never returns for /tmp.
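
For anyone wanting to reproduce this outside of NSS, the read loop described
above can be approximated with a small standalone test. This is only a rough
sketch and not the libfreebl3 source: read_up_to(), probe_files() and the
1024-byte limit are made-up names and values for illustration, and only the
file list is taken from this message. On a healthy Linux system, fread() on a
directory is expected to return 0 rather than block.

```c
#include <stdio.h>
#include <stddef.h>

/* Hypothetical helper (not from NSS): read up to `limit` bytes from `path`
 * using fread() and return how many bytes were actually read. */
static size_t read_up_to(const char *path, size_t limit)
{
    unsigned char buf[BUFSIZ];
    size_t total = 0;
    FILE *fp = fopen(path, "r");

    if (fp == NULL)
        return 0;                   /* missing file: skip it */

    while (total < limit) {
        size_t want = limit - total;
        if (want > sizeof buf)
            want = sizeof buf;

        size_t got = fread(buf, 1, want, fp);
        if (got == 0)
            break;                  /* EOF or read error (e.g. a directory) */
        total += got;
    }

    fclose(fp);
    return total;
}

/* Walk the file list quoted in the message, printing before and after each
 * read so that a hang shows up as a "reading" line with no result. */
static void probe_files(void)
{
    static const char *const files[] = {
        "/etc/passwd", "/etc/utmp", "/tmp", "/var/tmp", "/usr/tmp", 0
    };

    for (size_t i = 0; files[i] != 0; i++) {
        fprintf(stderr, "reading %s\n", files[i]);
        /* 1024 is an arbitrary limit chosen for this sketch. */
        fprintf(stderr, "  got %zu bytes\n", read_up_to(files[i], 1024));
    }
}
```

Calling probe_files() from main() and watching stderr should show which
entry, if any, never gets its "got ... bytes" line.
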
I played around with the ordering of the list, but that didn't matter. I
would see fread hang for any file which was a directory. So maybe a libc
or kernel problem. For grins, I wrote a test program using fread on /tmp
and that did succeed. So I'm not sure what is going on.

I wanted to give rpm a try (which was where I started), so I just
commented out the directories in the files list, rebuilt libfreebl3 and
installed it. Yay, that fixed the rpm hang.

Feeling good about that level of success, I tried "rpm --initdb" but that
failed with a "file not found" error. Turns out that a number of rpm
binaries, including /usr/bin/rpmdb, didn't get installed. This turned out
to be another make/shell problem, this time in the install-binPROGRAMS
rule. That rule takes a list of binaries and installs them, but only the
first in the list is getting installed. The basic flow is:

  list='x y z' ; for f in $list ; do echo $f $f ; done | \
    while read foo bar ; do echo $foo $bar ; done

What I see is that the for loop runs through all of the elements of list
but the while loop stops after reading the first one. So a handful of rpm
binaries didn't get installed. I made a test Makefile with one rule which
used the same script as the rpm Makefile. It worked. Weird.

Anyway, I manually installed the missing binaries and was able to init the
db, install a source rpm, query a binary rpm and other such simple things.

So that's where I am right now. Still no patches to work around the
make/shell issues in gdb or rpm. I think the gdb/libthread_db problem
needs fixing the most. Time spent on it will make debugging other problems
way easier.

--Mark