Re: [389-users] Master caught in infinite loop


 



On 2011-11-18 14:42, Rich Megginson wrote:
> On 11/18/2011 05:08 AM, Daniel Fenert wrote:
>> Hi,
>>
>> I'm using 389ds 1.2.5 with replication, my current setup:
>>
>> Master
>> |     \
>> L1     L2
>> | \    |  \
>> S1 S2 S3  S4
>>
>> L* - acting as slave to "master" and master to "S*"
>> S* - slaves to L*
>>
>>
>> From time to time (usually a few months apart) the "master" gets stuck
>> in what looks like an infinite loop.
>> Judging by the access log, it stops processing queries but keeps
>> accepting new connections until it runs out of fds.
>> After that it won't shut down cleanly; only SIGKILL saves the day.
>>
>> Workload:
>> Master is used only for updates, maybe 20 connections/s.
>> L* are used only for replication.
>> All binds and search queries go to S*, which are read-only.
>>
>> We also saw this problem with our previous, simpler setup:
>> Master
>> |  |  |  \
>> S1 S2 S3  S4...
>>
>> Is there a chance that upgrading to the latest version will fix the
>> problem? Have there been any fixes in this area? The upgrade will be
>> complex as hell ;)
>>
>> Error log from last problem:
>>   - Not listening for new connections - too many fds open
> Have you tried increasing the number of fds to 8192?

Yes, but raising the limit doesn't make sense: during normal operation
the master uses no more than 50-60 fds.
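
To catch the problem before the limit is hit, the fd count of the slapd process could be polled. A minimal sketch, assuming a Linux host with /proc; the function names, the baseline of 60 (taken from the figure above) and the factor of 10 are only illustrative assumptions, not anything provided by 389ds:

```python
import os


def count_open_fds(pid):
    """Count open file descriptors of a process via /proc (Linux only).

    Reading another process's fd directory needs the same user or root.
    """
    return len(os.listdir(f"/proc/{pid}/fd"))


def is_suspicious(open_fds, baseline=60, factor=10):
    """Flag an fd count far above the normal baseline.

    baseline and factor are hypothetical thresholds for illustration.
    """
    return open_fds > baseline * factor


if __name__ == "__main__":
    # Demo on the current process; in practice the slapd PID would be used.
    fds = count_open_fds(os.getpid())
    print(fds, is_suspicious(fds))
```

Run from cron or a monitoring agent, a check like this would show whether the count creeps up slowly or jumps to the limit at once.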

>>   - slapd shutting down - signaling operation threads
>>   - slapd shutting down - waiting for 120 threads to terminate
> Did the server shut down on its own, or did you shut it down normally
> (i.e. service dirsrv stop)?

We tried to stop it using the init.d scripts.

>> ... SIGKILL ...
>>   - 389-Directory/1.2.5 B2010.012.2034 starting up
>>   - Detected Disorderly Shutdown last time Directory Server was running,
>> recovering database.
>>   - slapd started.  Listening on All Interfaces port 389 for LDAP 
>> requests
>>
>> Number of fds: 4096.
> Since 1.2.5 we have fixed a number of bugs around connection 
> handling.  You might find that 1.2.9.9 (current stable version) works 
> much better for you.

OK, we'll try to upgrade.

How should we upgrade such a complex setup?
Should we go top-down (master first, then L*, then S*) or
bottom-up (S* first, then L*, master last)?
Shutting down all servers at once is not really an option.
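
To make the two candidate orders concrete, here is a minimal sketch that groups the servers from the diagram above into tiers by replication depth. The dictionary layout and the function name are illustrative assumptions; the sketch only enumerates the two orders and does not say which one is safe:

```python
from collections import deque

# Replication topology from the diagram: each server maps to its consumers.
TOPOLOGY = {
    "Master": ["L1", "L2"],
    "L1": ["S1", "S2"],
    "L2": ["S3", "S4"],
}


def tiers(topology, root):
    """Group servers by replication depth using breadth-first traversal."""
    levels = []
    queue = deque([(root, 0)])
    while queue:
        node, depth = queue.popleft()
        if depth == len(levels):
            levels.append([])
        levels[depth].append(node)
        for child in topology.get(node, []):
            queue.append((child, depth + 1))
    return levels


top_down = tiers(TOPOLOGY, "Master")   # master first, then L*, then S*
bottom_up = list(reversed(top_down))   # S* first, then L*, master last

if __name__ == "__main__":
    print("top-down: ", top_down)
    print("bottom-up:", bottom_up)
```

Within a tier the servers could be upgraded one at a time, so the rest of that tier keeps serving.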

-- 
Daniel Fenert
--
389 users mailing list
389-users@xxxxxxxxxxxxxxxxxxxxxxx
https://admin.fedoraproject.org/mailman/listinfo/389-users


