Re: [389-users] importing large subtree crashes ns-slapd

Christopher Wood wrote:
> I'm just getting started with 389 Directory Server (at work), and I've run into an issue that I'm not certain how to troubleshoot. I would greatly appreciate any assistance or tips you could offer, especially on where to look to see what's failing.
>
> Also, I apologize in advance for changing strings related to my employer's directory names and such, as I'm not comfortable with leaking that level of information to a public list.
>   
That's the right call - you should always obscure sensitive information 
like this.
>
> Overview:
>
> Initializing a large subtree from NDS 6.2 crashes ns-slapd, but other subtrees are fine.
>
>
> Top-Level Questions:
>
> 1) How do I stop ns-slapd from crashing?
>   
Good question.
> 2) How do I figure out what precisely is causing the crash? (With various levels of debug logging I get the same log entry.)
>   
You've already used the TRACE level (1) for logging - that's as verbose 
as it gets for this particular operation.  Next step would be to try to 
get a core file.
> 3) Is it possible to simply import my initialization ldif without duplication checks?
>   
No.
>
> Background:
>
> At work we have NDS 6.2 (single master on a physical server, virtual machine slaves), and would like to move our directories intact to a 389 2.6 installation via replication.
>   
What platform/OS?  32-bit or 64-bit?  By NDS 6.2 I'm assuming you mean 
Netscape Directory Server - by 2.6 I'm assuming you mean 1.2.6.a1 (a2 
should be hitting the mirrors tomorrow).
> I already have replicated several of our NDS 6.2 subtrees to 389 2.6 with no difficulties.
>
> I compiled our 389 installation from the source packages downloaded from http://directory.fedoraproject.org/wiki/Source.
Did you grab 389-ds-base 1.2.6.a1 or 1.2.6.a2?

What compiler flags did you use?

Do you have a core file?  If so, try using gdb:
    gdb /path/to/ns-slapd /path/to/core.pid
Once in gdb, type the "where" command:
    (gdb) where
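
If you would rather capture the stack trace non-interactively (for example, to paste it into a reply), a small wrapper along these lines might help. This is only a sketch: the function names are made up for illustration, the paths are the same placeholders as above, and it assumes your gdb accepts the -batch and -ex options.

```python
import subprocess


def gdb_where_command(binary, core):
    """Build a gdb command line that prints the stack trace and exits.

    -batch makes gdb exit after running its commands; -ex runs the
    given gdb command ("where") once the core file has been loaded.
    """
    return ["gdb", "-batch", "-ex", "where", binary, core]


def capture_backtrace(binary, core):
    """Run gdb and return whatever it printed (hypothetical helper)."""
    result = subprocess.run(
        gdb_where_command(binary, core),
        capture_output=True,
        text=True,
        check=False,  # gdb's exit status is not interpreted here
    )
    return result.stdout + result.stderr
```

Calling capture_backtrace("/path/to/ns-slapd", "/path/to/core.pid") should return the same text the interactive "where" command would show.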
> The underlying platform is:
>
> $ uname -a
> Linux cwlab-02.mycompany.com 2.6.18-164.el5 #1 SMP Thu Sep 3 03:33:56 EDT 2009 i686 i686 i386 GNU/Linux
> $ cat /etc/redhat-release 
> CentOS release 5.4 (Final)
>
> $ free
>              total       used       free     shared    buffers     cached
> Mem:       3894000    1336012    2557988          0     144944    1004716
> -/+ buffers/cache:     186352    3707648
> Swap:      2031608          0    2031608
>
>
> Procedure To Crash 389's ns-slapd:
>
> a) In the NDS 6.2 admin console, create a new replication agreement for the "o=This Big Net" subtree, and choose to "Create consumer initialization file".
>
> b) Copy the file to the 389 server.
>
> c) In the 389 2.6 admin console for the Directory Server, in the Configuration tab (Data -> o=This Big Net -> dbRoot), right-click and choose "Initialize Database". Use the ldif file copied over.
>
> The ns-slapd process crashes, and I always get this in /opt/dirsrv/var/log/dirsrv/slapd-cwlab-02/errors as the last two lines:
>
> [03/Mar/2010:12:50:04 -0500] - import ldapAuthRoot: Processing file "/home/cwood/tbn.ldif"
> [03/Mar/2010:12:50:04 -0500] - => str2entry_dupcheck
>
>
> Other Details:
>
>
> I found two bugs with the str2entry_dupcheck string in it, but they don't seem pertinent:
>
> https://bugzilla.redhat.com/show_bug.cgi?id=548115
> https://bugzilla.redhat.com/show_bug.cgi?id=243488
>
>
> This says that str2entry_dupcheck could be about two things:
>
> http://docs.sun.com/source/816-6699-10/ax_errcd.html
>
> "While attempting to convert a string entry to an LDAP entry, the server found that the entry has no DN."
>
> "The server failed to add a value to the value tree."
>
> (But this is an exported database from NDS 6.2, and I'm fairly sure, without reading them all, that every entry will have a DN.)
>   
The log message
[03/Mar/2010:12:50:04 -0500] - => str2entry_dupcheck

is just trace information, not a report of a problem or error.

Does the crash happen almost immediately?  Or does it take a while?  If 
the problem happens quickly, it would be worthwhile to scan the first 
couple of dozen entries looking for things like:
- entries without a DN
- attributes without a value

>
> If 389 is trying to check for duplicate entries, perhaps there are simply too many DNs?
>
> $ grep '^dn:' tbn.ldif | wc -l
> 636985
> $ ls -lh tbn.ldif 
> -rw-r--r-- 1 cwood cwood 755M Mar  3 11:24 tbn.ldif
>   
No.  The server should be able to handle this much data easily.  And it 
must check for duplicate entries.
>
> Per the instructions here:
>
> http://directory.fedoraproject.org/wiki/FAQ#Troubleshooting
>
> I set my debug logging first to 24579:
>
> 1 	 Trace function calls 
> 2 	 Debug packet handling 
> 8192 	 Replication debugging 
> 16384 	 Critical messages
>
> Then for the next try at reading logs I set it to 90115, the above plus:
>
> 65536 	 Plug-in debugging
>
> However, every time the log ended with the same set of lines noted above.
>   
Trace (1) is really the best level for this particular problem, and as 
you have found, even that gives limited information here.
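
For reference, the error log level is a sum of bit flags, which is why 24579 and 90115 both already include Trace. A quick sketch of the arithmetic, using only the five levels quoted above (the table is not the complete list, and the function name is made up for illustration):

```python
# Only the levels mentioned in this thread; the server defines more.
LEVELS = {
    1: "Trace function calls",
    2: "Debug packet handling",
    8192: "Replication debugging",
    16384: "Critical messages",
    65536: "Plug-in debugging",
}


def decode_log_level(level):
    """Split a combined level into known flag names plus leftover bits."""
    names = [name for bit, name in sorted(LEVELS.items()) if level & bit]
    unknown = level & ~sum(LEVELS)  # bits not covered by the table above
    return names, unknown


# 1 + 2 + 8192 + 16384 = 24579; adding 65536 gives 90115.
for value in (24579, 90115):
    names, unknown = decode_log_level(value)
    print(value, names, "unknown bits:", unknown)
```

Since both of your settings include bit 1, the trace output was already as detailed as it gets; adding more flags only adds messages from other subsystems.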

I think the next step would be to build the server with full debugging 
information (use -g and omit -O2 or any other -Ox) and get a stack trace 
with full debug information.
> --
> 389 users mailing list
> 389-users@xxxxxxxxxxxxxxxxxxxxxxx
> https://admin.fedoraproject.org/mailman/listinfo/389-users
>   
