Re: Non-contiguous attribute values

On Tue, 11 Mar 2014 07:17:25 -0600
Rich Megginson <rmeggins@xxxxxxxxxx> wrote:
> On 03/10/2014 09:17 PM, Timothy Pollard wrote:
> > On Mon, 10 Mar 2014 20:56:08 -0600
> > Rich Megginson <rmeggins@xxxxxxxxxx> wrote:
> >> On 03/10/2014 08:42 PM, Timothy Pollard wrote:
> >>> A small update; we're now
> >> Now as opposed to some time in the past?  At what point did you begin
> >> seeing these messages, and what changed?
> > It looks like it started after I manually "fixed" the entry.
> 
> What exactly did you do to fix the entry?

I edited it and filled in what looked like the missing values (which I copied
from an old LDIF file):

dNSClass: IN
zoneName: cvsdude.com
relativeDomainName: testingstatus
objectClass: top
objectClass: dNSZone
dNSTTL: 100

> 
> > As I said it is a
> > test entry, so I'm happy to delete it entirely and recreate it if you think
> > this will fix the issue,
> 
> I don't think it will fix the issue, but it may help reproduce it more easily.
> 
> > but I can hold off on that if you'd like me to find
> > out more information.
> 
> If you are not experiencing the "non-contiguous" problem now, there's not
> much information to get.
> 

We're not seeing the non-contiguous problem any more, but we are seeing
repeated DB crashes:

[11/Mar/2014:21:57:14 +0000] - libdb: dnsRoot/id2entry.db4 page 36132 is on free list with type 5
[11/Mar/2014:21:57:14 +0000] - libdb: PANIC: Invalid argument
[11/Mar/2014:21:57:14 +0000] - libdb: PANIC: fatal region error detected; run recovery
[11/Mar/2014:21:57:14 +0000] - Serious Error---Failed in dblayer_txn_abort, err=-30974 (DB_RUNRECOVERY: Fatal error, run database recovery)
[11/Mar/2014:21:57:14 +0000] - libdb: PANIC: fatal region error detected; run recovery
[11/Mar/2014:21:57:14 +0000] - FATAL ERROR at idl_new.c (1); server stopping as database recovery needed.

This happens within a few minutes of every restart of the daemon. I'm not
sure whether it is related to the non-contiguous problem, though. The new DB
error first occurred after ns-slapd was killed by the oom-killer. Could that
cause database corruption?
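
In case it helps anyone narrow down when the errors started, here is a rough
sketch of how the errors log could be scanned for the libdb messages quoted
above. The marker strings are taken from that excerpt; the function name
summarize and the idea of passing the log path on the command line are my own
assumptions, not anything 389 ships.

```python
import re
import sys
from collections import Counter

# Matches the timestamp prefix seen in the excerpt above,
# e.g. "[11/Mar/2014:21:57:14 +0000] - message"
LINE_RE = re.compile(r"^\[(?P<ts>[^\]]+)\] - (?P<msg>.*)$")

# Substrings copied from the excerpt above (case-sensitive); extend as needed.
MARKERS = ("PANIC", "DB_RUNRECOVERY", "FATAL ERROR")


def summarize(lines):
    """Return (timestamp of first matching line, Counter of marker hits)."""
    first_ts = None
    counts = Counter()
    for line in lines:
        m = LINE_RE.match(line.strip())
        if not m:
            continue  # skip lines that do not look like log entries
        for marker in MARKERS:
            if marker in m.group("msg"):
                counts[marker] += 1
                if first_ts is None:
                    first_ts = m.group("ts")
    return first_ts, counts


if __name__ == "__main__" and len(sys.argv) > 1:
    # The log path is supplied by the caller; no default location is assumed.
    with open(sys.argv[1], errors="replace") as fh:
        first, counts = summarize(fh)
    print("first match:", first)
    for marker, n in counts.items():
        print(marker, n)
```

Run against the errors log, this only reports the first matching timestamp and
a count per marker, which can then be compared against the time of the
oom-killer event in the system log.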

It also looks like we might need to do some memory tuning on 389. Is there any
recommended documentation on that, or should I just google it?

At the moment we've switched to our other master (we use a multi-master
replication setup), so we'll probably just rebuild the problem server from
there, but is there anything that I should look at to diagnose the problem first?

Thanks,
-- 
TimP
[http://blog.timp.com.au]
[http://resume.timp.com.au]


--
389 users mailing list
389-users@xxxxxxxxxxxxxxxxxxxxxxx
https://admin.fedoraproject.org/mailman/listinfo/389-users
