On Wed, 12 Feb 2014 09:22:19 -0800
Noriko Hosoi <nhosoi@xxxxxxxxxx> wrote:

> Rich Megginson wrote:
> > On 02/11/2014 10:32 PM, Timothy Pollard wrote:
> >>
> >> Our system:
> >> $ cat /etc/redhat-release
> >> CentOS release 6.5 (Final)
> >> $ uname -r
> >> 2.6.32-431.3.1.el6.x86_64
> >> $ rpm -q 389-ds
> >> 389-ds-1.2.2-1.el6.noarch
> > rpm -q 389-ds-base
> >
> > Are these servers running in VMs?

No, they're physical servers.

> >
> >> In case they're helpful I have stack traces from during both failed and
> >> successful searches. I can send them through if they are useful.
> >
> > Yes.
> >> Does anyone have any idea what might be causing this, or how we could go
> >> about fixing it? Should we report it as a bug?
> >
> > Yes. https://fedorahosted.org/389/newticket
> >
> > You can attach scripts, logs, stack traces, etc. to the ticket.

OK, I've created a new ticket and attached the stack traces. Thanks.
https://fedorahosted.org/389/ticket/47696

> Can we also have the output from "dbscan
> -f /var/lib/dirsrv/slapd-ldap-04/db/userRoot/entryrdn.db4" when the problem
> occurs?

Oddly it didn't break overnight, but I did copy the entryrdn.db4 file the
last time it broke, so I can take dbscans of both the backup of the broken
file and the currently working one. Unfortunately both files are over 140MB,
and the best compression I can manage gets them down to 16MB each, which is
too big to attach to the ticket. Any suggestions on how best to share such
large files?

>
> And, reindexing entryrdn is necessary for the temporary recovery? For
> instance, just restarting the server does not help? I'm wondering whether
> the dncache in memory is corrupted or the entryrdn index itself is ...

I'm pretty sure the reindex is necessary; I haven't specifically tested
that, but I have restarted the server while it was broken, to try to resolve
other issues as well.

>
> Thanks,
> --noriko

I also forgot to mention a potentially related error message.
Occasionally on entry deletion we see errors like this in our error log:

[12/Feb/2014:20:20:46 +0000] entryrdn-index - _entryrdn_delete_key: Failed to remove ou=test; has children
[12/Feb/2014:20:20:46 +0000] - database index operation failed BAD 1031, err=-1 Unknown error: -1

Since they mention entryrdn and indexes I thought they might be related, but
they don't seem to directly cause this problem: I'm seeing these errors now
while the system is fine, and I've had it break before with none of these
errors between the last re-index and the failure.

-- 
TimP [http://blog.timp.com.au]
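
One rough way to check whether such errors fall between the last re-index and
a failure is to filter the error log by time window. What follows is only a
minimal Python sketch under stated assumptions: the timestamp format is taken
from the two log lines quoted above, while the function name
`entryrdn_errors`, the marker strings, and the example time window are
illustrative choices, not anything provided by 389-ds.

```python
import re
from datetime import datetime

# Line layout as seen in the quoted log excerpt:
#   [12/Feb/2014:20:20:46 +0000] <message>
LINE_RE = re.compile(r"^\[(?P<ts>[^\]]+)\]\s+(?P<msg>.*)$")
TS_FORMAT = "%d/%b/%Y:%H:%M:%S %z"

# Substrings used to pick out the errors of interest (an assumption based
# on the two messages quoted above; extend as needed).
MARKERS = ("entryrdn", "database index operation failed")


def entryrdn_errors(lines, start, end):
    """Return (timestamp, message) pairs for matching lines within [start, end]."""
    hits = []
    for line in lines:
        m = LINE_RE.match(line.strip())
        if not m:
            continue  # not a timestamped log line
        try:
            ts = datetime.strptime(m.group("ts"), TS_FORMAT)
        except ValueError:
            continue  # timestamp in an unexpected format
        msg = m.group("msg")
        if start <= ts <= end and any(marker in msg for marker in MARKERS):
            hits.append((ts, msg))
    return hits


# Example using the two lines from the log excerpt above.
sample = [
    "[12/Feb/2014:20:20:46 +0000] entryrdn-index - _entryrdn_delete_key: "
    "Failed to remove ou=test; has children",
    "[12/Feb/2014:20:20:46 +0000] - database index operation failed "
    "BAD 1031, err=-1 Unknown error: -1",
]
# Hypothetical window: from the last re-index to the observed failure.
window_start = datetime.strptime("12/Feb/2014:00:00:00 +0000", TS_FORMAT)
window_end = datetime.strptime("13/Feb/2014:00:00:00 +0000", TS_FORMAT)

for ts, msg in entryrdn_errors(sample, window_start, window_end):
    print(ts.isoformat(), msg)
```

In practice the `lines` argument would be the opened error log file, with the
window set to the time of the last re-index and the time the searches started
failing. An empty result for such a window would support the observation that
the failure can occur without these errors preceding it.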
--
389 users mailing list
389-users@xxxxxxxxxxxxxxxxxxxxxxx
https://admin.fedoraproject.org/mailman/listinfo/389-users