Replication delay, connection blocking ending in closed - B1

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Hello –

 

We are seeing an issue where changes can be very slow to replicate to one of our consumers (up to 15m+).  We have a large topology, but in this case the issue is isolated between 2 masters that replicate to 1 consumer.

 

In one example entry addition I found, it appears that we see;

 

-          connection from master1 to consumer1, bunch of changes pushed

-          then master2 connects to consumer1

-          master1 stops doing changes but stays connected

-          master2 does literally 1 change (success, no issues), stays connected

-          16 minutes goes by, no additional changes or replication EXT ops for that connection are done (master1->consumer1 EXT ops continue normally…)

-          after that long pause, master2 disconnects - B1 bad BER tag code

-          and then Master1 resumes making tons of changes

 

Searches in this DB and others continue to take place, and changes in other DBs.  So it wasn’t as if the server was unresponsive/hung.  It is almost as if that DB went “read-only” for a time – I’m unable to tell if something else besides replication was attempting but unable to make writes though.

 

 

Anyone see something like this before?  We see lots of B1 codes randomly, I’ve never understood what may cause that – the description of corruption/physical network problems does not make much sense.  Maybe if our directories were replicating to eachother over the internet, or using Wifi….

 

The time it takes for that connection to end in the B1 doesn’t seem to line up with any dirsrv OR system/TCP timeouts either

 

 

Nothing illuminating in the error logs really.

 

Log snips of this happening;

 

[03/Mar/2021:13:50:34.395939343 -0600] conn=270076 fd=280 slot=280 connection from master1 to consumer1

...

[03/Mar/2021:13:50:34.608149165 -0600] conn=270076 op=13 MOD dn="cn=CRLblahblah,c=US"

[03/Mar/2021:13:50:34.609565373 -0600] conn=270076 op=13 RESULT err=0 tag=103 nentries=0 etime=0.001451229 csn=603febdd00035aae0000

[03/Mar/2021:13:50:34.818645762 -0600] conn=270076 op=14 EXT oid="2.16.840.1.113730.3.5.5" name="replication-multimaster-extop"

[03/Mar/2021:13:50:34.820544342 -0600] conn=270076 op=14 RESULT err=0 tag=120 nentries=0 etime=0.002004143

 

[03/Mar/2021:13:50:34.460210520 -0600] conn=270077 fd=307 slot=307 connection from master2 to consumer1

[03/Mar/2021:13:50:34.460676562 -0600] conn=270077 op=0 BIND dn="cn=replication manager,cn=config" method=128 version=3

[03/Mar/2021:13:50:34.460992487 -0600] conn=270077 op=0 RESULT err=0 tag=97 nentries=0 etime=0.000376400 dn="cn=replication manager,cn=config"

<snipped replication startup jargon>

[03/Mar/2021:13:50:35.715331361 -0600] conn=270077 op=5 MOD dn="cn=CRLblahblah,c=US"

[03/Mar/2021:13:50:35.717330540 -0600] conn=270077 op=5 RESULT err=0 tag=103 nentries=0 etime=0.002066162 csn=603febdd00045aae0000

...

[03/Mar/2021:13:50:36.828647054 -0600] conn=270076 op=15 EXT oid="2.16.840.1.113730.3.5.12" name="replication-multimaster-extop"

[03/Mar/2021:13:50:36.828888493 -0600] conn=270076 op=15 RESULT err=0 tag=120 nentries=0 etime=0.000368293

[03/Mar/2021:13:50:37.112334122 -0600] conn=270076 op=16 EXT oid="2.16.840.1.113730.3.5.12" name="replication-multimaster-extop"

[03/Mar/2021:13:50:37.112782309 -0600] conn=270076 op=16 RESULT err=0 tag=120 nentries=0 etime=0.000624719

[03/Mar/2021:13:50:38.113792312 -0600] conn=270076 op=17 EXT oid="2.16.840.1.113730.3.5.12" name="replication-multimaster-extop"

[03/Mar/2021:13:50:38.114092751 -0600] conn=270076 op=17 RESULT err=0 tag=120 nentries=0 etime=0.000476209

continued EXT ops on conn=270076, from master1

[03/Mar/2021:14:06:07.872623403 -0600] conn=270077 op=-1 fd=307 closed - B1

[03/Mar/2021:14:06:07.916640931 -0600] conn=270076 op=1106 EXT oid="2.16.840.1.113730.3.5.12" name="replication-multimaster-extop"

[03/Mar/2021:14:06:07.917191372 -0600] conn=270076 op=1106 RESULT err=0 tag=120 nentries=0 etime=0.000662946

[03/Mar/2021:14:06:07.921837193 -0600] conn=270076 op=1107 MOD dn="cn=CRLblahblah,c=US"

[03/Mar/2021:14:06:07.923754219 -0600] conn=270076 op=1107 RESULT err=0 tag=103 nentries=0 etime=0.002036469 csn=603febdd00065aae0000

and a flood of changes now

...

 

 

Colin Tulloch

Architect, USmPKI

colin.tulloch@xxxxxxxxxxx

 

_______________________________________________
389-users mailing list -- 389-users@xxxxxxxxxxxxxxxxxxxxxxx
To unsubscribe send an email to 389-users-leave@xxxxxxxxxxxxxxxxxxxxxxx
Fedora Code of Conduct: https://docs.fedoraproject.org/en-US/project/code-of-conduct/
List Guidelines: https://fedoraproject.org/wiki/Mailing_list_guidelines
List Archives: https://lists.fedoraproject.org/archives/list/389-users@xxxxxxxxxxxxxxxxxxxxxxx
Do not reply to spam on the list, report it: https://pagure.io/fedora-infrastructure

[Index of Archives]     [Fedora User Discussion]     [Older Fedora Users]     [Fedora Announce]     [Fedora Package Announce]     [EPEL Announce]     [Fedora News]     [Fedora Cloud]     [Fedora Advisory Board]     [Fedora Education]     [Fedora Security]     [Fedora Scitech]     [Fedora Robotics]     [Fedora Maintainers]     [Fedora Infrastructure]     [Fedora Websites]     [Anaconda Devel]     [Fedora Devel Java]     [Fedora Legacy]     [Fedora Desktop]     [Fedora Fonts]     [ATA RAID]     [Fedora Marketing]     [Fedora Management Tools]     [Fedora Mentors]     [Fedora Package Review]     [Fedora R Devel]     [Fedora PHP Devel]     [Kickstart]     [Fedora Music]     [Fedora Packaging]     [Centos]     [Fedora SELinux]     [Fedora Legal]     [Fedora Kernel]     [Fedora QA]     [Fedora Triage]     [Fedora OCaml]     [Coolkey]     [Virtualization Tools]     [ET Management Tools]     [Yum Users]     [Tux]     [Yosemite News]     [Yosemite Photos]     [Linux Apps]     [Maemo Users]     [Gnome Users]     [KDE Users]     [Fedora Tools]     [Fedora Art]     [Fedora Docs]     [Maemo Users]     [Asterisk PBX]     [Fedora Sparc]     [Fedora Universal Network Connector]     [Fedora ARM]

  Powered by Linux