Re: Chain on Update problem

Thanks.

I have tested various combinations of the tuning parameters without success. After further debugging, I've confirmed that the problem always starts after a bind operation timeout. Looking into the chaining plugin code, I see that an operation timeout results in a call to cb_ping_farm to check whether another server in the pool is available. However, it performs the following operation (the comment is telling):

    /* NOTE: This will fail if we implement the ability to disable
       anonymous bind */
    rc = ldap_search_ext_s(ld, NULL, LDAP_SCOPE_BASE, "objectclass=*", attrs, 1, NULL,
                           NULL, &timeout, 1, &result);
    if (LDAP_SUCCESS != rc) {
        slapi_ldap_unbind(ld);
        cb_update_failed_conn_cpt(cb);
        return LDAP_SERVER_DOWN;
    }

So basically, because we've disallowed anonymous bind for anything but the rootdse, this check will always fail to find another available server. I confirmed this by allowing anonymous bind on our masters while the issue was present, after which subsequent binds on the consumers started working again.

I would think it more appropriate for that code to do a search against the rootdse instead. Is there any good reason why it shouldn't? If not, I might test modifying it.


Thanks,
Grant



From: William Brown <wbrown@xxxxxxx>
Sent: Friday, March 5, 2021 3:52 PM
To: 389-users@xxxxxxxxxxxxxxxxxxxxxxx <389-users@xxxxxxxxxxxxxxxxxxxxxxx>
Subject: [389-users] Re: Chain on Update problem
 

> On 5 Mar 2021, at 12:03, Grant Byers <Grant.Byers@xxxxxxxxxxxxx> wrote:
>
> Hi All,
>
> Version: 1.3.10
>
> In our environment, we'd like to use a chaining backend to push BIND operations up to masters by way of the consumer (rather than client referral). We'd like to do this to ensure password lockout attributes are propagated to all consumers equally via our standard replication agreements. This is described here - https://directory.fedoraproject.org/docs/389ds/howto/howto-chainonupdate.html.
>
> NOTE, we do not have hubs in our topology. Just masters and consumers, so no intermediate chaining.
>
> We tested this process in our environment and it worked beautifully until we took it to production. Currently, we have just 2 masters, both sitting on over-subscribed hardware that suffers from I/O starvation at certain times of the day. The plan is to scale out our masters eventually, but we're a little hamstrung by other projects and priorities. Chaining worked extremely well until the time of day when the masters suffered from I/O starvation and, hence, very long I/O wait times. This is generally short-lived and happens at different times of the day for each master. However, it seems that once both nsfarmservers have "failed", the consumer never attempts to retry them. This leads to bind errors as follows:
>
> ldapwhoami -x -D "<binddn>" -W
> Enter LDAP Password:
> ldap_bind: Operations error (1)
>         additional info: FARM SERVER TEMPORARY UNAVAILABLE
>
> Except it is not temporary. It never recovers, even though all members of nsfarmservers are now healthy again (and are never unhealthy at the same time). We can confirm this by performing binds from the consumers directly against the masters. I thought that setting nsConnectionLife to something larger than 0 (indefinite) would help this, but it has not.

Chain on update appears to use the chaining plugin's timeouts, so adjusting the following parameters may help:

nsBindTimeout
nsOperationTimeout
nsBindRetryLimit
nsMaxResponseDelay
nsMaxTestResponseDelay
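
As a rough sketch, these attributes could be set on the chaining backend instance entry from the configuration quoted below. The values shown are illustrative placeholders, not recommendations, and the assumption that they belong on this particular entry should be checked against your deployment.

```ldif
# Illustrative values only; tune for your environment.
dn: cn=chainbe1,cn=chaining database,cn=plugins,cn=config
changetype: modify
replace: nsBindTimeout
nsBindTimeout: 15
-
replace: nsOperationTimeout
nsOperationTimeout: 10
-
replace: nsBindRetryLimit
nsBindRetryLimit: 3
-
replace: nsMaxResponseDelay
nsMaxResponseDelay: 60
-
replace: nsMaxTestResponseDelay
nsMaxTestResponseDelay: 15
```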



>
> Is this by design, a bug, or an implementation fault on my behalf? Configuration below;
>
> Thanks,
> Grant
>
>
>
> ## On masters, create a dedicated user for chaining backend
> dn: cn=proxy,cn=config
> objectClass: person
> objectClass: top
> cn: proxy
> sn: admin
>
> ## On all consumers, create chaining backend;
> dn: cn=chainbe1,cn=chaining database,cn=plugins,cn=config
> objectclass: top
> objectclass: extensibleObject
> objectclass: nsBackendInstance
> nsslapd-suffix: <suffix>
> nsfarmserverurl: ldaps://<master1>:636 <master2>:636/
> nsMultiplexorBindDN: <binddn>
> nsMultiplexorCredentials: <bindpw>
> nsCheckLocalACI: on
> nsConnectionLife: 30
> cn: chainbe1
>
> ## On all consumers, add the backend and repl_chain_on_update function
> dn: cn="<suffix>",cn=mapping tree,cn=config
> changetype: modify
> add: nsslapd-backend
> nsslapd-backend: chainbe1
> -
> add: nsslapd-distribution-plugin
> nsslapd-distribution-plugin: libreplication-plugin
> -
> add: nsslapd-distribution-funct
> nsslapd-distribution-funct: repl_chain_on_update
>
> ## On all servers, enable global password policy
> dn: cn=config
> changetype: modify
> replace: passwordIsGlobalPolicy
> passwordIsGlobalPolicy: on
>
> _______________________________________________
> 389-users mailing list -- 389-users@xxxxxxxxxxxxxxxxxxxxxxx
> To unsubscribe send an email to 389-users-leave@xxxxxxxxxxxxxxxxxxxxxxx
> Fedora Code of Conduct: https://docs.fedoraproject.org/en-US/project/code-of-conduct/
> List Guidelines: https://fedoraproject.org/wiki/Mailing_list_guidelines
> List Archives: https://lists.fedoraproject.org/archives/list/389-users@xxxxxxxxxxxxxxxxxxxxxxx
> Do not reply to spam on the list, report it: https://pagure.io/fedora-infrastructure


Sincerely,

William Brown

Senior Software Engineer, 389 Directory Server
SUSE Labs, Australia
