What is the output of rpm -q 389-ds-base?
I have 6 replicas (two of which are read-only). I
ran into an issue where a DELETE operation failed on a
server with error code 51 (LDAP_BUSY).
[21/Oct/2014:23:44:44 -0400] conn=78160 op=39510 RESULT err=51 tag=107 nentries=0 etime=3 csn=5447282c000300050000
The application retried the delete several times
for a couple of hours (while the server wasn’t getting
any other requests) and the result was always the same
(err=51). Each time that happened, the error log had
the following:
[21/Oct/2014:23:44:44 -0400] - Retry count exceeded in delete
My first question is, what would cause a problem
like this?
I simply restarted that directory server, and then the
update succeeded. However, when the update went to
the other 5 servers, they failed in the same way and
the same error was logged in their log files. But the
update wasn’t retried. It was just skipped and future
updates via replication succeeded on those 5 servers.
My second question is, what’s the best way to
monitor for these types of replication errors? In
this case, nsds5replicaLastUpdateStatus did not
indicate a problem. If I had not been looking at the
error file on those 5 hosts, I’m wondering how I would
have known that a delete failed to replicate to them.
If the answer is to just have something monitoring
the error log files, are there specific search strings
to look for to separate out updates that have failed
and won’t be retried from other errors (e.g. temporary
connection issues)? Just curious if there is a best
practice here.
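
To make the log-monitoring idea concrete, here is a minimal Python sketch. It assumes the access and error logs look like the two excerpts quoted above; the function names are hypothetical, and the patterns are taken only from this incident. It is not a confirmed way to tell skipped updates apart from transient errors.

```python
import re

# Patterns copied from the log excerpts in this message. Other server
# versions may word these lines differently (assumption).
RESULT_RE = re.compile(
    r"conn=(?P<conn>\d+) op=(?P<op>\d+) RESULT err=(?P<err>\d+)"
    r".*?\bcsn=(?P<csn>[0-9a-f]+)"
)
RETRY_EXCEEDED = "Retry count exceeded"


def failed_updates(access_lines, err_code=51):
    """Yield (conn, op, csn) for RESULT lines that carry err_code and a CSN."""
    for line in access_lines:
        m = RESULT_RE.search(line)
        if m and int(m.group("err")) == err_code:
            yield int(m.group("conn")), int(m.group("op")), m.group("csn")


def retry_exceeded(error_lines):
    """Return the error-log lines that report an exhausted retry count."""
    return [ln.rstrip("\n") for ln in error_lines if RETRY_EXCEEDED in ln]
```

Both functions accept any iterable of lines, so an open log file can be passed directly. Collecting the CSNs of failed updates on each server would at least show which changes to check for on the other replicas.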
Thanks!
— Shilen
--
389 users mailing list
389-users@xxxxxxxxxxxxxxxxxxxxxxx
https://admin.fedoraproject.org/mailman/listinfo/389-users