Re: changelog program - _cl5AddThread - Invalid changelog state - 2

On 10/19/21 11:07 AM, Kees Bakker wrote:
On 19-10-2021 15:58, Kees Bakker wrote:
On 19-10-2021 14:13, Mark Reynolds wrote:
On 10/19/21 5:35 AM, Kees Bakker wrote:
On 18-10-2021 20:18, Mark Reynolds wrote:
On 10/18/21 1:52 PM, Kees Bakker wrote:
On 18-10-2021 16:30, Mark Reynolds wrote:
On 10/18/21 8:17 AM, Kees Bakker wrote:
Hi,

Today I tried 389-ds-base 1.4.4.17 for a fix of retro changelog trimming [1].

Unfortunately, ns-slapd got into some sort of deadlock, I think.
Anyway, I reverted 389-ds-base back to 1.4.3.23.

Yeah, the replication changelog was moved in 1.4.4, so by downgrading you
most likely corrupted the changelog.  Stop the server, remove the old
changelog, /var/lib/dirsrv/slapd-INST/db/changelogdb, and the new one,
/var/lib/dirsrv/slapd-INST/db/userroot/replication_changelog.db
Hmm. I don't have these (files?).

/var/lib/dirsrv/slapd-INST/db/changelogdb/  <===  this directory, and
its contents, are the 1.4.3 replication changelog (typically defined in
cn=changelog5,cn=config).
We don't have cn=changelog5,cn=config

Right, you need to create it to fix the issue on 1.4.3, as in the link
I sent you below.  Otherwise you have no changelog and replication cannot
work.
Ah, you're right. I really appreciate your feedback.

I looked through all the backups; we never had /var/lib/dirsrv/slapd-INST/db/changelogdb/
However, we did have /var/lib/dirsrv/slapd-INST/cldb/, which is now gone.

With that in mind I constructed an ldapmodify to re-create the cldb database.
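For reference, the entry was roughly like this (a sketch, not the exact LDIF I ran; the attribute names follow the 1.4.3 cn=changelog5 schema, the directory path is our deployment's, and the trimming value is just an example):

```ldif
dn: cn=changelog5,cn=config
changetype: add
objectclass: top
objectclass: extensibleObject
cn: changelog5
# Path is deployment-specific; ours was the old cldb directory.
nsslapd-changelogdir: /var/lib/dirsrv/slapd-INST/cldb
# Optional trimming, e.g. keep 30 days of changes:
nsslapd-changelogmaxage: 30d
```

Fed to ldapmodify as Directory Manager.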

From what I can see, the replication agreements are still present. I now have to make sense of the last few bullets:
  • Use one supplier, a data master, as the source for initializing consumers.

Check.

  • Do not reinitialize a data master when the replication agreements are created. For example, do not initialize server1 from server2 if server2 has already been initialized from server1.

Makes sense, if I understand correctly.

  • For a multi-master scenario, initialize all of the other master servers in the configuration from one master.

Check.

  • For cascading replication, initialize all of the hubs from a supplier, then initialize the consumers from the hubs.

Not applicable for us.

Everything is back to normal, it seems. In summary, I did:
* re-create the changelog db
* re-initialize the replica
* restart dirsrv
ipa-healthcheck is happy too.
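For the record, on an IPA-managed replica those three steps roughly correspond to commands like these (a sketch; the host and instance names are examples from this thread, and ipa-replica-manage assumes a FreeIPA deployment):

```shell
# 1. Re-create the replication changelog entry (cn=changelog5,cn=config)
#    with ldapmodify, per the Red Hat guide linked in this thread.
# 2. Re-initialize this replica from a healthy master:
ipa-replica-manage re-initialize --from iparep4.example.com
# 3. Restart the directory server (probably not required, but harmless):
systemctl restart dirsrv@EXAMPLE-COM.service
```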

Sorry, I was in a lot of meetings today and couldn't respond sooner.  But yes, this is what I would have suggested.  The server restart was probably not needed, but it doesn't hurt.  Glad you're up and running!

Mark




HTH,
Mark


On the other hand, we do have cn=changelog,cn=ldbm database,cn=plugins,cn=config
with
nsslapd-directory: /var/lib/dirsrv/slapd-INST/db/changelog


/var/lib/dirsrv/slapd-INST/db/userroot/replication_changelog.db <=== in
1.4.4 we moved the global replication changelog into a database file for
each backend.

If you don't see these, then there is nothing to clean up.


I do have this directory: /var/lib/dirsrv/slapd-INST/db/changelog
Should I remove that whole directory?

No, that is the retro changelog database.  There is no need to remove
it.

So I suspect the downgrade to 1.4.3 screwed everything up.  It sounds
like you need to simply recreate the replication changelog (if you are
staying on 1.4.3); please follow the instructions from this link:

https://access.redhat.com/documentation/en-us/red_hat_directory_server/10/html/administration_guide/managing_replication-configuring-replication-cmd#Configuring-Replication-Suppliers-cmd

I don't feel comfortable executing the commands in this document. If
there are no "simpler" methods, then I will probably re-install the
replica. In any event, I want to be very careful before I take the
next step.

Just to be safe, it might be a good idea to restart the server after
adding the replication changelog config entry.

HTH,
Mark



Start the server, and reinit the agreements on this server

That should do it.

Mark


But now I have a replication problem. Could this have been caused by
the update to 1.4.4.17? And, if yes, how can I fix this?

[18/Oct/2021:12:17:41.750334062 +0200] - ERR - NSMMReplicationPlugin - changelog program - _cl5AddThread - Invalid changelog state - 2
[18/Oct/2021:12:17:41.782505596 +0200] - ERR - NSMMReplicationPlugin - send_updates - agmt="cn=iparep4.example.com-to-rotte.example.com" (rotte:389): Changelog database was in an incorrect state
[18/Oct/2021:12:17:41.827732779 +0200] - ERR - NSMMReplicationPlugin - repl5_inc_run - agmt="cn=iparep4.example.com-to-rotte.example.com" (rotte:389): Incremental update failed and requires administrator action

[1] https://github.com/389ds/389-ds-base/pull/4895

--
Directory Server Development Team





_______________________________________________
389-users mailing list -- 389-users@xxxxxxxxxxxxxxxxxxxxxxx
To unsubscribe send an email to 389-users-leave@xxxxxxxxxxxxxxxxxxxxxxx
Fedora Code of Conduct: https://docs.fedoraproject.org/en-US/project/code-of-conduct/
List Guidelines: https://fedoraproject.org/wiki/Mailing_list_guidelines
List Archives: https://lists.fedoraproject.org/archives/list/389-users@xxxxxxxxxxxxxxxxxxxxxxx
Do not reply to spam on the list, report it: https://pagure.io/fedora-infrastructure


