On 2011-11-18 14:42, Rich Megginson wrote:
> On 11/18/2011 05:08 AM, Daniel Fenert wrote:
>> Hi,
>>
>> I'm using 389ds 1.2.5 with replication, my current setup:
>>
>>        Master
>>        |    \
>>       L1     L2
>>      | \    | \
>>     S1 S2  S3 S4
>>
>> L* - acting as slave to "master" and master to "S*"
>> S* - slaves to L*
>>
>>
>> From time to time (usually few months between problems) we encounter
>> "master" going to some infinite loop.
>> After analyzing access log, it looks like it stops doing queries, and
>> accepts new connections until it runs out of fd's.
>> After that, it won't stop peacefully, only SIGKILL saves the day.
>>
>> Workload:
>> Master is used only for updates, maybe 20 connections/s.
>> L* are used only for replication.
>> All bind's and search queries are targeted to S* which are read only.
>>
>> With previous setup (less complicated), we've also seen this problem:
>>      Master
>>     | | | \
>>    S1 S2 S3 S4...
>>
>> Is there a chance that upgrading to latest version will fix the problem?
>> Were there any fixes nearby? Upgrade will be complex as hell ;)
>>
>> Error log from last problem:
>> - Not listening for new connections - too many fds open
> Have you tried increasing the number of fds to 8192?

Yes, but it doesn't make sense: during normal operation the master uses
no more than 50-60 fds.

>> - slapd shutting down - signaling operation threads
>> - slapd shutting down - waiting for 120 threads to terminate
> Does the server shutdown on its own, or did you shut it down normally
> (i.e. service dirsrv stop)?

We tried to stop it using the init.d scripts.

>> ... SIGKILL ...
>> - 389-Directory/1.2.5 B2010.012.2034 starting up
>> - Detected Disorderly Shutdown last time Directory Server was running,
>> recovering database.
>> - slapd started. Listening on All Interfaces port 389 for LDAP
>> requests
>>
>> Number of fds: 4096.
> Since 1.2.5 we have fixed a number of bugs around connection
> handling. You might find that 1.2.9.9 (current stable version) works
> much better for you.

OK, we'll try to upgrade.
How should we upgrade such a complex setup?
Should we go top-to-bottom (master first, then L*, then S*) or
bottom-to-top (S*, L*, master last)?
Shutting down all servers at once is not really an option.

--
Daniel Fenert
--
389 users mailing list
389-users@xxxxxxxxxxxxxxxxxxxxxxx
https://admin.fedoraproject.org/mailman/listinfo/389-users
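
Editor's note, not part of the original message: since the master normally sits at 50-60 fds against a limit of 4096, one way to catch the runaway state early might be to watch the descriptor count of the slapd process. The following is only a minimal sketch, assuming a Linux host with /proc; the function names and the 50% threshold are illustrative assumptions, not anything provided by 389 Directory Server.

```python
import os
import resource


def count_open_fds(pid="self"):
    """Count open descriptors by listing /proc/<pid>/fd (Linux only).

    Reading another user's process usually requires root.
    """
    return len(os.listdir(f"/proc/{pid}/fd"))


def fd_usage_ratio(open_fds, limit):
    """Return the fraction of the descriptor limit currently in use."""
    if limit <= 0:
        raise ValueError("limit must be positive")
    return open_fds / limit


def should_alert(open_fds, limit, threshold=0.5):
    """True once usage reaches the (assumed) alert threshold."""
    return fd_usage_ratio(open_fds, limit) >= threshold


if __name__ == "__main__":
    # Demo on this script's own process; for slapd, pass its PID and the
    # limit configured for the server (4096 in the report above).
    soft_limit, _hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    used = count_open_fds()
    print(f"open fds: {used}, soft limit: {soft_limit}")
    print("alert" if should_alert(used, soft_limit) else "ok")
```

With the numbers from the report, `should_alert(60, 4096)` is False, while a count climbing into the thousands would trip the check well before the server stops accepting connections.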