Re: Enabling retro changelog maxage with 3 million entries make dirsrv not respond anymore

On 9/6/21 3:40 PM, Kees Bakker wrote:
On 06-09-2021 14:34, Thierry Bordaz wrote:
On 9/6/21 1:55 PM, Kees Bakker wrote:
Hi,

First a bit of context.

CentOS 7, FreeIPA
389-ds-base-snmp-1.3.9.1-13.el7_7.x86_64
389-ds-base-libs-1.3.9.1-13.el7_7.x86_64
389-ds-base-1.3.9.1-13.el7_7.x86_64

A long time ago I was experiencing a deadlock during retro changelog cleanup
and I was advised to disable it as a workaround. Disabling was done by setting
nsslapd-changelogmaxage to -1. Since then the number of entries grew to
about 3 million.

Last week I enabled maxage again. I set it to 470 days. I was hoping to limit
this pile of old changelog entries, starting by cleaning up the very old ones.
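
For readers following along, the change described above could be expressed as an LDIF modify request along these lines. This is only a sketch: it assumes the retro changelog plugin entry lives at its usual DN (cn=Retro Changelog Plugin,cn=plugins,cn=config) and that the age is written as a number plus a unit suffix (470d). The maxage_ldif helper name is made up for illustration, and the bind options in the commented usage line are assumptions, not taken from the thread.

```shell
#!/bin/sh
# maxage_ldif VALUE: print an LDIF modify request that sets
# nsslapd-changelogmaxage on the retro changelog plugin entry.
# The DN below is the usual location of the plugin entry; verify it on your server.
maxage_ldif() {
    cat <<EOF
dn: cn=Retro Changelog Plugin,cn=plugins,cn=config
changetype: modify
replace: nsslapd-changelogmaxage
nsslapd-changelogmaxage: $1
EOF
}

# Print the request for a 470-day limit.
maxage_ldif 470d

# To apply it (assumed bind options):
# maxage_ldif 470d | ldapmodify -x -D 'cn=directory manager' -W -H ldaps://rotte.example.com
```

Printing the LDIF first makes it possible to review the request before piping it into ldapmodify.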

However, what I noticed is that it was removing entries at a pace of 16 entries
per second. Meanwhile the server was otherwise doing nothing. Server load was very low.

The real problem is that dirsrv (LDAP) is not responding to any requests anymore. I
had to disable maxage again, which requires patience restarting the server when
it is not responding ;-)

Now my questions:
1) is it normal that removing retro changelog entries is slooow?
2) why is dirsrv not responding anymore when the cleanup kicks in?
3) are there alternative ways to clean up the old retro changelog entries?

Hi,

When the server is not responsive, can it process searches like

ldapsearch -b "" -s base ?

ldapsearch -D 'cn=directory manager' -W -b "cn=config" -s base

or ldapsearch -D 'cn=directory manager' -W -b "cn=monitor" ?
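
Since a hung server can leave each of these searches blocked indefinitely, one way to run them is under a time limit. The following is a minimal sketch, assuming GNU coreutils timeout and the OpenLDAP ldapsearch client. The probe helper, the PROBE_TIMEOUT variable, the -x simple-bind option and the password file path are all illustrative assumptions, not something suggested in the thread.

```shell
#!/bin/sh
# probe NAME CMD...: run CMD under a time limit and report the outcome.
# PROBE_TIMEOUT (seconds, default 10) is a hypothetical knob.
probe() {
    name=$1
    shift
    timeout "${PROBE_TIMEOUT:-10}" "$@" >/dev/null 2>&1
    rc=$?
    case $rc in
        0)   echo "$name: ok" ;;
        124) echo "$name: timed out" ;;   # exit status GNU timeout uses on expiry
        *)   echo "$name: failed (exit $rc)" ;;
    esac
}

# Example calls (commented out because they need a live server).
# -y reads the bind password from a file so nothing prompts inside the time limit.
# probe rootdse ldapsearch -x -H ldaps://rotte.example.com -b "" -s base
# probe config  ldapsearch -x -H ldaps://rotte.example.com -D 'cn=directory manager' -y /path/to/pwfile -b "cn=config" -s base
# probe monitor ldapsearch -x -H ldaps://rotte.example.com -D 'cn=directory manager' -y /path/to/pwfile -b "cn=monitor"
```

A "timed out" result for the root DSE or cn=config search would indicate that even requests that do not touch the database are stuck.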

I'll have to do this when I get a new chance. This LDAP server is
hard-coded in several other services, even though we have replicas.
These services will be hanging when I do this.

One thing I can say is that the following command was hanging.

ldapsearch -H ldaps://rotte.example.com -b cn=config
Interesting. I was "guessing" that db update + checkpointing + compaction was responsible for some temporary slowdown. Since the cn=config backend (in memory) was also frozen, the only thing I can imagine is that other SRCH requests, which go to the db, were blocked by update + checkpoint + compact, and there were no workers left to process requests that are not database related.

Regarding the low rate of trimming, how did you monitor it? Are you using internal op logging, plugin log level, or something else?

Just a rough estimate. After 15 minutes I had to disable maxage again.
Before and after I looked at the oldest entry. That way I saw it removed
about 15000 entries.
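
As a back-of-the-envelope check on those numbers, the sketch below redoes the arithmetic in shell. The 15000 entries and 15 minutes come from the message above. Treating all 3 million entries as needing removal is a deliberately pessimistic assumption, since a 470-day limit would only trim part of them.

```shell
#!/bin/sh
# Figures from the thread: about 15000 entries removed in about 15 minutes.
removed=15000
minutes=15
# Pessimistic assumption: every one of the roughly 3 million entries must be trimmed.
total=3000000

# Shell arithmetic is integer-only, so both results are rounded down.
rate=$((removed / (minutes * 60)))
hours=$((total / rate / 3600))

echo "trim rate: about $rate entries/s"
echo "worst case for $total entries: about $hours hours"
```

This agrees with the 16 entries per second observed earlier and suggests that, at that pace, a full cleanup would take on the order of days rather than minutes.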


So you were able to do an online update of maxage. So the server was (very) slow but still processing?


Is there any particular logging you can recommend?
I was concerned that you had enabled debug-level logging to monitor the trimming; that would be very noisy.

When the server is not responsive, does it consume CPU? Could you collect 'top -H -p `pidof ns-slapd` -b' and some pstack output?

As I said above, I'll have to pick the right moment to do this again.
Last time I got a lot of complaints from the users. :-(


Yes, this can be done during a calm period.

thanks
thierry

-- Kees

thanks
thierry


_______________________________________________
389-users mailing list -- 389-users@xxxxxxxxxxxxxxxxxxxxxxx
To unsubscribe send an email to 389-users-leave@xxxxxxxxxxxxxxxxxxxxxxx
Fedora Code of Conduct: https://docs.fedoraproject.org/en-US/project/code-of-conduct/
List Guidelines: https://fedoraproject.org/wiki/Mailing_list_guidelines
List Archives: https://lists.fedoraproject.org/archives/list/389-users@xxxxxxxxxxxxxxxxxxxxxxx
Do not reply to spam on the list, report it: https://pagure.io/fedora-infrastructure

