Re: Determining max CSN of running server

> On 29 Feb 2024, at 05:20, William Faulk <d4hgcdgdmj@xxxxxxxxxxxxxx> wrote:
> 
> I'm having another replication problem where changes made on a particular server are not being replicated outward at all. Right now, I'm trying to determine what's going on during the replication process.
> 
> (Caveat: I'm still running an old version of 389ds: v1.3.10. In particular, the dsconf utility does not exist.)
> 
> My understanding is that when a server receives a change from a client, it wraps it up as a CSN and starts a replication session with its peers, during which it sends a message that states the greatest CSN that it originated. First off, is that a correct understanding?

Might be worth re-reading https://lists.fedoraproject.org/archives/list/389-users@xxxxxxxxxxxxxxxxxxxxxxx/thread/UYP4PYBVVDKGKZVZTC34JVXNUVP2VAVI/ 

It doesn't send a single CSN. Replication compares the RUVs of the two servers and determines the range of CSNs that are missing from the consumer.
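
To make the idea concrete, here is a rough sketch of that comparison. It is a simplified model, not the server's actual algorithm: it assumes each RUV has already been reduced to a mapping of replica ID to max CSN, and that CSNs are fixed-width 20-character hex strings so they can be compared as integers. The name missing_ranges is made up for illustration.

```python
from typing import Dict, Optional, Tuple

def missing_ranges(
    supplier_ruv: Dict[int, str],
    consumer_ruv: Dict[int, str],
) -> Dict[int, Tuple[Optional[str], str]]:
    """Per replica ID, return (consumer_max, supplier_max) where the consumer is behind.

    A consumer_max of None means the consumer has seen nothing from that replica ID.
    """
    ranges = {}
    for rid, supplier_max in supplier_ruv.items():
        consumer_max = consumer_ruv.get(rid)
        # Fixed-width hex CSNs order the same way as their integer values.
        if consumer_max is None or int(consumer_max, 16) < int(supplier_max, 16):
            ranges[rid] = (consumer_max, supplier_max)
    return ranges
```

An empty result means the consumer already has everything the supplier knows about, which is the converged case.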

It's also not immediate. When the server accepts a change (add, mod, etc.), the change is associated with a CSN, but there may be a delay before the two nodes actually communicate and exchange data.

> 
> If so, how can I determine what CSN a particular server is telling its replication peers during those sessions? I have a feeling that this server is, for some reason, sending an inaccurate number.

Generally you'd need replication logging (errorlog level 8192). But it's very noisy and can be hard to read. What you need to look for is the range of CSNs that the two servers agree to send.

Also remember that CSNs are a monotonic Lamport clock. This means they only ever advance and can never step backwards, so they have some different properties to what you may expect. If they ever go backwards, I think the replication handler throws a pretty nasty error.
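
If it helps when reading logs, a CSN can be pulled apart into its fields. This is a minimal sketch that assumes the usual 20-hex-character layout (8 characters of timestamp, 4 of sequence number, 4 of replica ID, 4 of sub-sequence number). The names Csn, parse_csn and csn_time are made up for illustration.

```python
from datetime import datetime, timezone
from typing import NamedTuple

class Csn(NamedTuple):
    # Field order matters: tuples compare left to right, timestamp first.
    timestamp: int
    seqnum: int
    rid: int
    subseqnum: int

def parse_csn(text: str) -> Csn:
    """Split a 20-character hex CSN string into its four fields."""
    if len(text) != 20:
        raise ValueError(f"expected 20 hex characters, got {len(text)}")
    return Csn(
        timestamp=int(text[0:8], 16),
        seqnum=int(text[8:12], 16),
        rid=int(text[12:16], 16),
        subseqnum=int(text[16:20], 16),
    )

def csn_time(csn: Csn) -> datetime:
    """Interpret the timestamp field as seconds since the Unix epoch (UTC)."""
    return datetime.fromtimestamp(csn.timestamp, tz=timezone.utc)
```

Because Csn is a tuple, parse_csn(a) < parse_csn(b) tells you whether a is ordered before b, which is handy for checking that the CSNs from one replica ID in a log only ever go up.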

> 
> In the cn=replica,cn=...,cn=mapping tree,cn=config tree, there are entries for each of the server's topology peers, and they contain nsds50ruv attributes that seem to be the RUVs that the server has received from those peers, right? But the nsds50ruv attribute also exists directly in the cn=replica if you explicitly ask for it. Is it possible that this is the server's own RUV?

I *think* so. It's been a while since I had to look. The nsds50ruv shows the RUV of the server, and I think the other replica entries are "what the peer's RUV was last time". But I think Thierry or Pierre would know more about that than me. Some of the replication monitoring code in newer versions does this for you, so I'd probably advise you to attempt to upgrade your environment. 1.3 is really old at this point (and I'm not sure if even RH or SUSE still support that version).
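
Once you have the nsds50ruv values in hand, something like the following can turn them into a structure that is easier to compare. It is only a sketch: it assumes values shaped like "{replica <id> <url>} <mincsn> <maxcsn>", that a replica with no changes yet has no CSNs after the closing brace, and that any other value (such as the replicageneration line) can be skipped. RuvElement and parse_ruv are illustrative names, and the hostnames in the usage example are invented.

```python
import re
from typing import Dict, Iterable, NamedTuple, Optional

class RuvElement(NamedTuple):
    url: str
    min_csn: Optional[str]
    max_csn: Optional[str]

# "{replica 1 ldap://host:389} <mincsn> <maxcsn>", with the two CSNs optional.
_REPLICA = re.compile(
    r"^\{replica\s+(\d+)\s+(\S+)\}\s*"
    r"(?:([0-9a-fA-F]{20})\s+([0-9a-fA-F]{20}))?\s*$"
)

def parse_ruv(values: Iterable[str]) -> Dict[int, RuvElement]:
    """Map replica ID to its URL and min/max CSN, ignoring non-replica values."""
    ruv = {}
    for value in values:
        match = _REPLICA.match(value.strip())
        if match is None:
            continue  # e.g. the {replicageneration} value
        rid, url, min_csn, max_csn = match.groups()
        ruv[int(rid)] = RuvElement(url, min_csn, max_csn)
    return ruv

# Usage with made-up values:
sample = [
    "{replicageneration} 65d0000a000000010000",
    "{replica 1 ldap://a.example.com:389} 65d0000a000000010000 65dfc1a0000000010000",
    "{replica 2 ldap://b.example.com:389}",
]
for rid, element in sorted(parse_ruv(sample).items()):
    print(rid, element.url, element.max_csn)
```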

> 
> Can I rely on the nsds50ruv attributes on this server's peers'  cn=replica nsds50ruv attribute values to be an accurate reflection of what this server is sending as its CSN in replication sessions?
> 
> Any other way to see what's going on in a replication session? (I'm even trying to decrypt a network capture, but I'm not having any luck with that yet.)
> 
> In particular, I see the max CSN for this server in all of these RUVs less than CSNs recorded in the server's own log files.

The problem here is that to read the RUVs and then compare them, you need to read the RUV from each server and then check whether they are advancing (not whether they are equal). See, it's okay if the RUVs are not the same between two servers, because that can simply indicate that a server has accepted a write and not yet sent it to another node. In fact it's common in busy environments that every server has "slightly different state" because they have to continually replicate and converge.

For example, imagine some user A changes their password. Now that change has to propagate and converge between all the nodes in the topology. While that convergence is occurring, another user B could be changing their password. This can leave you with servers where:

* A and B passwords are original
* A password is changed, B original
* A password original, B changed
* A and B have been changed.

And all four of these states are valid! 
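
Since "advancing" is the thing to check rather than "equal", one way to look at it is to read the same server's RUV twice, some time apart, and see which replica IDs did not move. The sketch below assumes each read has already been reduced to a mapping of replica ID to max CSN (20-character hex strings); stalled_replica_ids is a made-up name. Note that a replica ID that did not advance may simply have had no writes in that window, so this only narrows down where to look.

```python
from typing import Dict, List

def stalled_replica_ids(earlier: Dict[int, str], later: Dict[int, str]) -> List[int]:
    """Return replica IDs whose max CSN did not increase between two reads."""
    stalled = []
    for rid, old_max in earlier.items():
        new_max = later.get(rid)
        # Missing from the later read, or not strictly greater: treat as stalled.
        if new_max is None or int(new_max, 16) <= int(old_max, 16):
            stalled.append(rid)
    return sorted(stalled)
```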

If you want to assert that "some change I made at CSN X is on all servers", then you would need to read and parse the RUV from every server and ensure that all of them are at or past that CSN for that replica ID.
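
As a sketch of that check, assuming you have already collected a mapping of replica ID to max CSN from each server, and assuming the replica ID sits in characters 12 to 16 of a 20-character hex CSN: the function name and the server names in the usage example are invented for illustration.

```python
from typing import Dict, List

def servers_missing_csn(target_csn: str, max_csns_by_server: Dict[str, Dict[int, str]]) -> List[str]:
    """Return the servers whose RUV is not yet at or past target_csn."""
    rid = int(target_csn[12:16], 16)  # replica ID that originated the change
    target = int(target_csn, 16)
    lagging = []
    for server, ruv in max_csns_by_server.items():
        seen = ruv.get(rid)
        if seen is None or int(seen, 16) < target:
            lagging.append(server)
    return sorted(lagging)

# Usage with made-up data: the change was made on replica ID 1.
ruvs = {
    "ldap1": {1: "65dfc1a0000000010000"},
    "ldap2": {1: "65dfc100000000010000"},
    "ldap3": {2: "65dfc1ff000000020000"},
}
print(servers_missing_csn("65dfc1a0000000010000", ruvs))
```

An empty list means every server you polled has reached that CSN for that replica ID.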

Either way - it's not trivial :) 


--
Sincerely,

William Brown

Senior Software Engineer,
Identity and Access Management
SUSE Labs, Australia
--