Re: Error code 51 and replication errors

Rich Megginson <rmeggins@xxxxxxxxxx> · Wed, 22 Oct 2014 10:14:37 -0600



    On 10/22/2014 10:10 AM, Shilen Patel wrote:

    
        389-ds-base-1.2.11.32-1.el6.x86_64
      
    
    I would strongly encourage you to use the version provided with EL
    6.6, which is 389-ds-base-1.2.11.15-47.  It looks like you are using
    a build from the old rmeggins repo or the newer copr repo.  These
    are really only for those users who needed critical fixes or
    features not yet in the "supported" EL6.6 version.  I don't know if
    that will fix your problem, but it will make it a lot easier to
    support.

    
      Thanks!
      

      —
        Shilen
      
        
          From: Rich Megginson <rmeggins@xxxxxxxxxx>
          Reply-To: "389-users@xxxxxxxxxxxxxxxxxxxxxxx" <389-users@xxxxxxxxxxxxxxxxxxxxxxx>
          Date: Wednesday, October 22, 2014 at 12:07 PM
          To: "389-users@xxxxxxxxxxxxxxxxxxxxxxx" <389-users@xxxxxxxxxxxxxxxxxxxxxxx>
          Subject: Re: Error code 51 and replication errors

        
              On 10/22/2014 09:54 AM, Shilen Patel wrote:

              
                Hi,
                

                I’m running 1.2.11.32.
              
              
              What is the output of rpm -q 389-ds-base?

              
                I have 6 replicas (two of which are read-only).  I
                  ran into an issue where a DELETE operation failed on
                  one server with error code 51 (LDAP_BUSY).
                

                  [21/Oct/2014:23:44:44 -0400] conn=78160 op=39510 RESULT err=51 tag=107 nentries=0 etime=3 csn=5447282c000300050000
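
For anyone who wants to pull the err= and csn= values out of access-log lines like the one above, here is a minimal Python sketch. The regular expression and the parse_result_line helper are illustrative names, not part of 389-ds, and the sketch assumes RESULT lines always follow the layout shown in this excerpt.

```python
import re

# Assumed layout, taken from the single RESULT line quoted in this thread:
# [timestamp] conn=N op=N RESULT key=value key=value ...
RESULT_RE = re.compile(
    r"^\[(?P<timestamp>[^\]]+)\]\s+conn=(?P<conn>\d+)\s+op=(?P<op>\d+)"
    r"\s+RESULT\s+(?P<fields>.*)$"
)


def parse_result_line(line):
    """Return a dict of the fields in a RESULT line, or None if it does not match."""
    match = RESULT_RE.match(line.strip())
    if match is None:
        return None
    record = {
        "timestamp": match.group("timestamp"),
        "conn": int(match.group("conn")),
        "op": int(match.group("op")),
    }
    # The remaining fields (err, tag, nentries, etime, csn) are kept as strings.
    for pair in match.group("fields").split():
        key, sep, value = pair.partition("=")
        if sep:
            record[key] = value
    return record


line = (
    "[21/Oct/2014:23:44:44 -0400] conn=78160 op=39510 RESULT "
    "err=51 tag=107 nentries=0 etime=3 csn=5447282c000300050000"
)
rec = parse_result_line(line)
```

Filtering the parsed records on rec["err"] == "51" would list the busy results together with the CSN of each affected update.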
                
                
                The application retried the delete several times
                  for a couple of hours (while the server wasn’t getting
                  any other requests) and the result was always the same
                  (err=51).  Each time that happened, the error log had
                  the following:
                

                  [21/Oct/2014:23:44:44 -0400] - Retry count exceeded in delete
                
                
                My first question is, what would cause a problem
                  like this?
                

                I simply restarted that directory server, and then the
                  update succeeded.  However, when the update was
                  replicated to the other 5 servers, it failed on each of
                  them in the same way, and the same error was logged in
                  their log files.  But the update wasn’t retried.  It was
                  just skipped, and future updates via replication
                  succeeded on those 5 servers.
                

                My second question is, what’s the best way to
                  monitor for these types of replication errors?  In
                  this case, nsds5replicaLastUpdateStatus did not
                  indicate a problem.  If I had not been looking at the
                  error file on those 5 hosts, I’m wondering how I would
                  have known that a delete failed to replicate to them.
                   If the answer is to just have something monitoring
                  the error log files, are there specific search strings
                  to look for to separate out updates that have failed
                  and won’t be retried from other errors (e.g. temporary
                  connection issues)?  Just curious if there is a best
                  practice here.
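
As a starting point for the log-monitoring idea, here is a minimal Python sketch that scans error-log lines for the one message quoted in this thread. The helper name find_matches and the sample lines are hypothetical. Whether this string reliably separates skipped updates from transient errors is an assumption, not something confirmed in this thread.

```python
import re

# Assumed error-log layout, based on the single line quoted in this thread:
# [timestamp] - message
ERROR_RE = re.compile(r"^\[(?P<timestamp>[^\]]+)\]\s+-\s+(?P<message>.*)$")

# The only search string known from this thread; extend the list as needed.
NEEDLES = ["Retry count exceeded in delete"]


def find_matches(lines, needles=NEEDLES):
    """Return (timestamp, message) pairs for lines containing any needle."""
    hits = []
    for line in lines:
        match = ERROR_RE.match(line.strip())
        if match is None:
            continue
        message = match.group("message")
        if any(needle in message for needle in needles):
            hits.append((match.group("timestamp"), message))
    return hits


# Hypothetical sample input; the second line is made up for contrast.
sample = [
    "[21/Oct/2014:23:44:44 -0400] - Retry count exceeded in delete",
    "[21/Oct/2014:23:45:00 -0400] - some unrelated message",
]
hits = find_matches(sample)
```

In practice the lines would come from the error log file on each replica, and any hit could be cross-checked against the access log for a matching err=51 result.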
                

                Thanks!
                

                — Shilen
                

                --
389 users mailing list
389-users@xxxxxxxxxxxxxxxxxxxxxxx
https://admin.fedoraproject.org/mailman/listinfo/389-users
              
              
    
    