Hi Andrey,
we have a fix to address the incorrect positioning in the changelog
(using a CSN of a consumer which is ahead for the given replica ID),
so it should also prevent these messages.
It still has to be tested, but I am wondering if you want to test it
as well.
Regards,
Ludwig
On 09/07/2016 03:33 PM, Ivanov Andrey (M.) wrote:
The fixes for the tickets you mention did change
the iteration through the changelog and how it
handles situations when the start CSN is not
found in the changelog. They also changed
the logging, so you might now see messages
which were not there or were hidden before.
That was my understanding too.
So far I have not seen any replication problems
related to these messages; all generated CSNs seem to
be replicated. What makes it a bit more difficult is
that most of the updates are updates of lastlogintime
and the original MOD is not logged. I still do not
understand why we have these messages so frequently; I
will try to reproduce it.
Or, if it is possible, could you run the servers for just
an hour with replication logging enabled?
No more need for this: I found the messages in a
deployment where replication logging was enabled. I think it
happens when the smallest consumer maxCSN is ahead of
the local maxCSN for this replica ID.
It should do no harm, but in some scenarios it could slow
down replication a bit.
I will continue to investigate and work on a fix.
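
To make the condition above concrete, here is a minimal Python sketch of the comparison. It is an illustration only and not the server's actual code. It assumes the usual 20-hex-digit CSN string layout (8 digits timestamp, 4 sequence number, 4 replica ID, 4 sub-sequence number) and an ordering by those fields in that order. The names `CSN` and `consumer_ahead` and the sample values are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True, order=True)
class CSN:
    """Parsed CSN; field order defines the (assumed) comparison order."""
    timestamp: int
    seqnum: int
    replica_id: int
    subseqnum: int

    @classmethod
    def parse(cls, text: str) -> "CSN":
        # Assumed layout: 8 + 4 + 4 + 4 hexadecimal digits.
        if len(text) != 20:
            raise ValueError(f"expected 20 hex digits, got {text!r}")
        return cls(
            int(text[0:8], 16),
            int(text[8:12], 16),
            int(text[12:16], 16),
            int(text[16:20], 16),
        )


def consumer_ahead(local_max: str, consumer_maxes: list[str], replica_id: int) -> bool:
    """True if the smallest consumer maxCSN for replica_id is ahead of the local maxCSN."""
    local = CSN.parse(local_max)
    # Keep only the consumer maxCSNs that belong to the replica ID of interest.
    candidates = [
        csn for csn in map(CSN.parse, consumer_maxes) if csn.replica_id == replica_id
    ]
    if local.replica_id != replica_id or not candidates:
        return False
    return min(candidates) > local


# Hypothetical values: both consumers already hold newer CSNs for replica ID 1.
print(consumer_ahead(
    "57d01a2b000000010000",
    ["57d01a2c000000010000", "57d01a2d000100010000"],
    1,
))
```

In that situation the supplier has nothing newer to send for this replica ID, which matches the observation that the messages are harmless.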
OK, thank you. And yes, as you say, apparently it does
no harm: I check the consistency of the three replicated
servers from time to time and there is no data
discrepancy between these servers.
Anyway, enabling replication logging on production
servers is not something easily done, mainly for
performance reasons. And I was not able to reproduce the
problem in our test environment with 2 replicated
servers; maybe the load or the frequency of connections
updating the lastlogintime attribute was not high enough in
the test environment. Or the three-server fully replicated
topology makes things a bit different too, with one or
two additional hops for the same mod arriving at the
consumer by two different paths.
When looking into the provided data set I did notice
three replicated ops with err=50, insufficient access.
This should not happen and requires a separate
investigation.
Yes, I see the three modifications you are talking
about. They are present on only one of the three servers.
Strange indeed. No more err=50 in replicated ops today
on any of the servers, I've just checked.