Richard Megginson wrote:
> Del wrote:
>>
>> Hi,
>>
>> Following an earlier suggestion on this thread, I have tried to get FDS
>> running on a Fedora 7 box using the binary RPM from the download area
>> for Fedora Core 6.
>>
>> The directory server appears to run fine, but the admin server just spews
>> a torrent of log messages saying:
>>
>> [Wed Aug 08 18:00:07 2007] [notice] child pid 19260 exit signal
>> Segmentation fault (11)
>> ... etc.
> I'm not sure what the problem is. I just downloaded FDS 1.0.4 for FC6
> x86_64 and installed on a vmware instance of F7 x86_64. The F7 system
> has the latest updates as of today. It works fine. I ran setup, just
> accepted the defaults, setup completed and started the admin server. I
> don't have java installed on the system, but I was able to use the web
> interface to run several of the CGIs. I have no problems.
>>
>> Has anyone else seen this and can anyone offer any suggestions as to how
>> to get it going? It's quite tricky to run strace / gdb on the httpd
>> binary as all I get is as far as the fork, and it appears to be the
>> httpd.worker child processes that are dying.
> strace -f will follow forks (-ff to write each process output to
> separate files), and gdb has a mode to follow forks as well.

I've been working on this for some weeks now with no success. I have
one server which has been upgraded from FC6 to FC7 and it works fine.
I have another server which is a new FC7 install and it fails. Both
are similar architecture, x86 32 bit.
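With -ff and an output prefix, strace writes one file per traced process, so a busy httpd leaves a lot of files to read through. As a rough sketch only (the prefix "httpd" and the function name are hypothetical, and it assumes the traces were written by something like `strace -ff -o httpd ...`), the following picks out the per-process files that ended in SIGSEGV and lists the last files each one opened successfully:

```python
import re
import sys
from pathlib import Path

# Matches the quoted path argument of open()/openat() lines in strace output.
OPEN_RE = re.compile(r'^open(?:at)?\((?:[^,"]*,\s*)?"([^"]+)"')
SEGV_MARK = "+++ killed by SIGSEGV"


def crashed_traces(trace_dir, prefix, last_n=3):
    """Map each trace file that ended in SIGSEGV to the last files it opened."""
    results = {}
    for path in sorted(Path(trace_dir).glob(prefix + ".*")):
        lines = path.read_text(errors="replace").splitlines()
        if not any(SEGV_MARK in line for line in lines):
            continue  # this process did not die from a segfault
        opened = []
        for line in lines:
            match = OPEN_RE.match(line)
            # Failed opens (ENOENT and friends) return -1; skip those.
            if match and "= -1 " not in line:
                opened.append(match.group(1))
        results[path.name] = opened[-last_n:]
    return results


if __name__ == "__main__":
    # Usage: script.py <trace directory> <prefix given to strace -o>
    for name, files in crashed_traces(sys.argv[1], sys.argv[2]).items():
        print(name, "->", ", ".join(files))
```

This only narrows down which files the dying children touched last; it does not say which library call actually crashed, for which a gdb backtrace is still needed.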
The strace -ff output shows this on each process on the machine where
it is failing:

open("tls/i686/libnsl.so.1", O_RDONLY) = -1 ENOENT (No such file or directory)
open("tls/sse2/libnsl.so.1", O_RDONLY) = -1 ENOENT (No such file or directory)
open("tls/libnsl.so.1", O_RDONLY) = -1 ENOENT (No such file or directory)
open("i686/sse2/libnsl.so.1", O_RDONLY) = -1 ENOENT (No such file or directory)
open("i686/libnsl.so.1", O_RDONLY) = -1 ENOENT (No such file or directory)
open("sse2/libnsl.so.1", O_RDONLY) = -1 ENOENT (No such file or directory)
open("libnsl.so.1", O_RDONLY) = -1 ENOENT (No such file or directory)
open("/lib/libnsl.so.1", O_RDONLY) = 30
read(30, "\177ELF\1\1\1\0\0\0\0\0\0\0\0\0\3\0\3\0\1\0\0\0@\341\211"..., 512) = 512
fstat64(30, {st_mode=S_IFREG|0755, st_size=109732, ...}) = 0
mmap2(NULL, 100296, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 30, 0) = 0x50a32000
mmap2(0x50a47000, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 30, 0x14) = 0x50a47000
mmap2(0x50a49000, 6088, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0x50a49000
close(30) = 0
mprotect(0x50a47000, 4096, PROT_READ) = 0
munmap(0xb7282000, 109394) = 0
rt_sigaction(SIGPIPE, {SIG_IGN}, {SIG_IGN}, 8) = 0
geteuid32() = 0
futex(0x5defa564, FUTEX_WAKE, 2147483647) = 0
open("/etc/ldap.conf", O_RDONLY) = 30
fstat64(30, {st_mode=S_IFREG|0644, st_size=9020, ...}) = 0
fstat64(30, {st_mode=S_IFREG|0644, st_size=9020, ...}) = 0
mmap2(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xb7f5d000
read(30, "# @(#)$Id: ldap.conf,v 1.38 2006"..., 4096) = 4096
read(30, "assword ad\n\n# Use the OpenLDAP p"..., 4096) = 4096
read(30, " for 2.1 and later is \"yes\".\n#tl"..., 4096) = 828
read(30, "", 4096) = 0
close(30) = 0
munmap(0xb7f5d000, 4096) = 0
uname({sys="Linux", node="marvin.babel.office", ...}) = 0
open("/etc/hosts", O_RDONLY) = 30
fcntl64(30, F_GETFD) = 0
fcntl64(30, F_SETFD, FD_CLOEXEC) = 0
fstat64(30, {st_mode=S_IFREG|0644, st_size=278, ...}) = 0
mmap2(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xb7f5d000
read(30, "# Do not remove the following li"..., 4096) = 278
close(30) = 0
munmap(0xb7f5d000, 4096) = 0
--- SIGSEGV (Segmentation fault) @ 0 (0) ---
+++ killed by SIGSEGV +++

So it looks like it's attempting a connection to the LDAP server in
NSS_LDAP somewhere, possibly looking for the current uid, and then
looking in /etc/hosts for the current host name.

/etc/ldap.conf contains the IP address of both LDAP servers.

/etc/hosts in the current case looks like this:

127.0.0.1 localhost.localdomain localhost
192.168.110.52 marvin.babel.office marvin
192.168.110.42 fortytwo.babel.office fortytwo

All of these IP addresses are also mapped (and reverse mapped) in the
local DNS. Everything else on these systems works normally -- internet
access, web browsing, sendmail, etc, all of the stuff that would
normally use /etc/hosts and/or DNS. I've checked the systems over
fairly extensively.

I can't think of why the admin server is failing at this point.
Anything I should go looking for next?

On the machine where the admin server is not failing -- the strace
output looks completely different. It doesn't appear to be doing any
NSS/DNS/etc/hosts lookups at all.

--
Del
Babel Com Australia
http://www.babel.com.au/
ph: 02 9368 0728 fax: 02 9368 0758