On Mar 22, 2013, at 12:04 PM, Rich Megginson wrote:

> On 03/21/2013 02:45 PM, Morgan Jones wrote:
>> Hello everyone,
>>
>> We've standardized on CentOS Directory for our ~30,000 user directory environment. It's 6 servers total: two multi-master, two read-only consumers with a full replication agreement and two read-only consumers with partial replication.
>>
>> We have a specific problem that we were *sure* was fixed in CentOS directory 8.2.8.
>
> What problem was that, and why were you sure it was fixed in centos-ds-base 8.2.8?

The problem is the "Bad parameter to an ldap routine" error below:

[19/Mar/2013:17:59:49 -0400] NSMMReplicationPlugin - agmt="cn=ldapm01-mgmt to ds01-mgmt" (ds01-mgmt:636): Failed to send update operation to consumer (uniqueid c3230b03-18e411e2-af56b819-045c296a, CSN 5148b7cd000100010000): Bad parameter to an ldap routine. Will retry later.
[19/Mar/2013:17:59:49 -0400] NSMMReplicationPlugin - agmt="cn=ldapm01-mgmt to ds02-mgmt" (ds02-mgmt:636): Failed to send update operation to consumer (uniqueid c3230b03-18e411e2-af56b819-045c296a, CSN 5148b7cd000100010000): Bad parameter to an ldap routine. Will retry later.

This bug looks similar to what we were seeing and is why we upgraded to 8.2.8:

https://fedorahosted.org/389/ticket/317

We aren't using the attributes in question, but I imagine the bug could be extrapolated to other attributes.

We cannot keep our partial replicas synchronized, so until we sort this out we have been re-initializing replication a few times a day, which is not a great situation.

I have a hard time imagining that partial replication is this broken, which makes me suspect we're doing something wrong, but so far we haven't found anything. I have a long history with Sun Directory, so I do know the product well.

>
>> It was not and now we're wondering if we'd be better off on 389 or Red Hat Directory since we'd at least have reliable changelogs with the former and support to call for the latter.
>>
>> Here's the problem, in the master error logs:
>> [19/Mar/2013:17:59:49 -0400] NSMMReplicationPlugin - agmt="cn=ldapm01-mgmt to ds01-mgmt" (ds01-mgmt:636): Failed to send update operation to consumer (uniqueid c3230b03-18e411e2-af56b819-045c296a, CSN 5148b7cd000100010000): Bad parameter to an ldap routine. Will retry later.
>> [19/Mar/2013:17:59:49 -0400] NSMMReplicationPlugin - agmt="cn=ldapm01-mgmt to ds02-mgmt" (ds02-mgmt:636): Failed to send update operation to consumer (uniqueid c3230b03-18e411e2-af56b819-045c296a, CSN 5148b7cd000100010000): Bad parameter to an ldap routine. Will retry later.
>>
>> It repeats once every few seconds. Reinitializing replication solves it for a while, maybe an hour, and then it recurs.
>>
>> Recently we've started seeing this:
>> [21/Mar/2013:15:00:01 -0400] NSMMReplicationPlugin - agmt="cn=ldapm01-mgmt to ds01-mgmt" (ds01-mgmt:636): Unable to acquire replica: there is no replicated area "dc=philasd,dc=org" on the consumer server. Replication is aborting.
>>
>> Deleting the host as a replica and re-adding it solves it, but it shouldn't be happening.
>
> Could be related to https://fedorahosted.org/389/ticket/374
>
>> They were Sun Directory customers going back to 5.2, so this product is comfortable for them, but with the pain we've been feeling as we roll it into production we're trying to decide if we should consider alternatives.
>>
>> Does anyone have insight on the problem above or on whether it's best to stick with CentOS, switch to Red Hat or 389?
>>
>> They're heavy users of open source and happy to self support. If Red Hat support for directory is good they'd be happy to go that direction.
>
> Disclaimer: I work for Red Hat on RHDS, so my opinion is entirely biased. If you pay for support, you will get good support. I know because I work on some of these escalations.

That is helpful. We are loath to upgrade to the commercial version and find we still have problems.
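
For anyone who wants to check whether the retries are all stuck on the same change, the "Failed to send update operation" lines quoted above can be tallied per agreement and CSN. This is only a rough sketch: the regular expression is modeled on the two log lines in this thread and may not match other versions' output, and the names summarize_failures and print_summary are made up for illustration.

```python
import re
from collections import Counter

# Pattern modeled on the "Failed to send update operation" lines quoted in
# this thread (assumption: other log line formats are simply skipped).
FAILURE_RE = re.compile(
    r'\[(?P<ts>[^\]]+)\] NSMMReplicationPlugin - '
    r'agmt="(?P<agmt>[^"]+)" \((?P<consumer>[^)]+)\): '
    r'Failed to send update operation to consumer '
    r'\(uniqueid (?P<uniqueid>[0-9a-fA-F-]+), CSN (?P<csn>[0-9a-fA-F]+)\): '
    r'(?P<error>.+?)\. Will retry later\.'
)


def summarize_failures(lines):
    """Count failed updates per (agreement, CSN, error text)."""
    counts = Counter()
    for line in lines:
        match = FAILURE_RE.search(line)
        if match:
            key = (match.group("agmt"), match.group("csn"), match.group("error"))
            counts[key] += 1
    return counts


def print_summary(path):
    """Print one row per (agreement, CSN, error), most frequent first.

    `path` is whatever errors log you want to inspect (hypothetical helper).
    """
    with open(path, errors="replace") as log:
        for (agmt, csn, error), n in summarize_failures(log).most_common():
            print(f"{n:6d}  {agmt}  CSN {csn}  {error}")
```

Calling print_summary() on the master's errors log would show whether a single CSN accounts for every retry on both partial-replica agreements, which is what the two quoted lines suggest.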

thanks,

-morgan
--
389 users mailing list
389-users@xxxxxxxxxxxxxxxxxxxxxxx
https://admin.fedoraproject.org/mailman/listinfo/389-users