https://bugzilla.redhat.com/show_bug.cgi?id=487425
Resolves: bug 487425
Bug Description: slapd crashes after changelog is moved
Reviewed by: ???
Files: see diff
Branch: HEAD
Fix Description: There are a number of real fixes, mixed in with many
changes for debugging and instrumentation.
1) When the update thread gets the changelog iterator, it now uses
_cl5AddThread to increment the count of threads holding an open handle
to the changelog, and it decrements that count when it releases the
iterator or if an error occurs while acquiring the database handle.
Previously, the code incremented the thread count when retrieving the
DB object but immediately decremented it again, so the thread still
held an open handle to the database with no way for the changelog code
to know that (except via the reference count on the DB object itself).
A sketch of the acquire/release pairing follows the links below.
https://bugzilla.redhat.com/attachment.cgi?id=334019&action=diff#ldapserver/ldap/servers/plugins/replication/cl5_api.c_sec4
https://bugzilla.redhat.com/attachment.cgi?id=334019&action=diff#ldapserver/ldap/servers/plugins/replication/cl5_api.c_sec5
https://bugzilla.redhat.com/attachment.cgi?id=334019&action=diff#ldapserver/ldap/servers/plugins/replication/cl5_api.c_sec6
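To make the pairing concrete, here is a minimal self-contained sketch
of the pattern (names ending in _sketch, the int "handle", and the
helper open_db_sketch are invented for illustration; this is not the
actual cl5 code):

    /* Sketch of the handle-count pairing described above.  In the real
     * code the count is maintained by _cl5AddThread() and its
     * counterpart; a plain counter stands in for it here. */
    static int thread_count;             /* threads holding an open handle */

    static int  add_thread_sketch(void)    { thread_count++; return 0; }
    static void remove_thread_sketch(void) { thread_count--; }

    static int open_db_sketch(void) { return 1; } /* >= 0 handle, < 0 error */

    /* Acquire: bump the count and keep it bumped for as long as the DB
     * handle stays open.  Only the error path decrements immediately. */
    static int get_iterator_sketch(int *db_handle)
    {
        if (add_thread_sketch() != 0) {
            return -1;                   /* could not register this thread */
        }
        *db_handle = open_db_sketch();
        if (*db_handle < 0) {
            remove_thread_sketch();      /* error path: give the count back */
            return -1;
        }
        return 0;    /* success: the count stays held until release below */
    }

    /* Release: the paired decrement happens when the iterator is released,
     * not (as before the fix) immediately after the DB object is fetched. */
    static void release_iterator_sketch(int *db_handle)
    {
        *db_handle = -1;                 /* placeholder for the DB close */
        remove_thread_sketch();
    }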
2) Changed the AddThread code to increment the thread count outside of
the state lock. This matches the semantics of the other uses of the
thread count, which all happen outside the lock; see the sketch after
the link below.
https://bugzilla.redhat.com/attachment.cgi?id=334019&action=diff#ldapserver/ldap/servers/plugins/replication/cl5_api.c_sec12
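A sketch of that ordering, using C11 atomics and a pthread mutex in
place of whichever primitives the real code uses (an assumption on my
part):

    #include <pthread.h>
    #include <stdatomic.h>

    static atomic_int s_thread_count;        /* updated outside the lock */
    static pthread_mutex_t s_state_lock = PTHREAD_MUTEX_INITIALIZER;
    static int s_changelog_open = 1;         /* stand-in changelog state */

    static int add_thread_sketch(void)
    {
        /* Increment outside the state lock, matching every other use of
         * the thread count; the lock protects only the state check. */
        atomic_fetch_add(&s_thread_count, 1);

        pthread_mutex_lock(&s_state_lock);
        int open = s_changelog_open;
        pthread_mutex_unlock(&s_state_lock);

        if (!open) {
            atomic_fetch_sub(&s_thread_count, 1);   /* undo on failure */
            return -1;
        }
        return 0;
    }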
3) The changelog code that closes the databases was not shutting
things down in the correct order. It must first wait for all threads
holding open database handles or otherwise accessing the database to
terminate; only once that is done can it call _cl5DBClose() to
actually close all of the databases. Doing it in the other order
leaves a race condition in which a database can be accessed after it
has been closed. A sketch of the ordering follows the link below.
https://bugzilla.redhat.com/attachment.cgi?id=334019&action=diff#ldapserver/ldap/servers/plugins/replication/cl5_api.c_sec16
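In outline, the corrected shutdown order looks like this (invented
names; the real code uses the changelog's own synchronization rather
than a yield loop):

    #include <sched.h>
    #include <stdatomic.h>

    static atomic_int s_thread_count;    /* threads still using the DB */

    static void wait_for_threads_sketch(void)
    {
        /* Step 1: block until no thread holds an open handle or is
         * otherwise touching the database. */
        while (atomic_load(&s_thread_count) > 0) {
            sched_yield();
        }
    }

    static void close_changelog_sketch(void)
    {
        wait_for_threads_sketch();       /* must happen first */
        /* Step 2: only now is it safe to do what _cl5DBClose() does and
         * actually close all of the databases.  Reversing these two
         * steps reintroduces the use-after-close race. */
    }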
4) Added clcache cleanup code and made it possible to re-initialize
the clcache, which was not designed to be dynamically closed and
reopened. A sketch of the init/cleanup guard follows the links below.
The clcache is initialized in _cl5Open:
https://bugzilla.redhat.com/attachment.cgi?id=334019&action=diff#ldapserver/ldap/servers/plugins/replication/cl5_api.c_sec8
clcache_init is re-entrant:
https://bugzilla.redhat.com/attachment.cgi?id=334019&action=diff#ldapserver/ldap/servers/plugins/replication/cl5_clcache.c_sec1
Added more code to clean up the clcache:
https://bugzilla.redhat.com/attachment.cgi?id=334019&action=diff#ldapserver/ldap/servers/plugins/replication/cl5_clcache.c_sec4
https://bugzilla.redhat.com/attachment.cgi?id=334019&action=diff#ldapserver/ldap/servers/plugins/replication/cl5_clcache.c_sec5
Delete the clcache in _cl5Delete:
https://bugzilla.redhat.com/attachment.cgi?id=334019&action=diff#ldapserver/ldap/servers/plugins/replication/cl5_api.c_sec18
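The shape of the re-entrant init/cleanup guard, sketched with a flag
and a mutex (invented names; the real clcache also manages buffer
lists, per-buffer locks, and the DB environment):

    #include <pthread.h>

    static pthread_mutex_t s_cache_lock = PTHREAD_MUTEX_INITIALIZER;
    static int s_cache_initialized;

    static int clcache_init_sketch(void)
    {
        pthread_mutex_lock(&s_cache_lock);
        if (!s_cache_initialized) {
            /* ...allocate buffers, create locks, record the DB env... */
            s_cache_initialized = 1;
        }
        /* If already initialized this is a safe no-op, so _cl5Open can
         * call it again after a close/delete cycle. */
        pthread_mutex_unlock(&s_cache_lock);
        return 0;
    }

    static void clcache_cleanup_sketch(void)
    {
        pthread_mutex_lock(&s_cache_lock);
        if (s_cache_initialized) {
            /* ...free buffers, destroy per-buffer locks... */
            s_cache_initialized = 0;     /* permits a later re-init */
        }
        pthread_mutex_unlock(&s_cache_lock);
    }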
5) The clcache stores the current buffer in thread-private storage.
If the clcache has been re-initialized, that buffer is no longer
valid, and the clcache code must detect this and get a new buffer; see
the sketch after the link below.
https://bugzilla.redhat.com/attachment.cgi?id=334019&action=diff#ldapserver/ldap/servers/plugins/replication/cl5_clcache.c_sec2
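One way to detect the stale buffer is a generation counter bumped on
every re-init; this sketch assumes that scheme (the actual patch may
detect staleness differently) and a key created elsewhere with
pthread_key_create:

    #include <pthread.h>
    #include <stdatomic.h>
    #include <stdlib.h>

    typedef struct {
        int generation;                 /* clcache generation at alloc time */
        /* ...the cached changelog buffer itself... */
    } CLC_Buffer;

    static atomic_int s_generation = 1; /* bumped by each clcache re-init */
    static pthread_key_t s_buf_key;     /* created via pthread_key_create */

    static CLC_Buffer *get_buffer_sketch(void)
    {
        CLC_Buffer *buf = pthread_getspecific(s_buf_key);

        if (buf != NULL && buf->generation != atomic_load(&s_generation)) {
            /* The clcache was re-initialized after this thread cached
             * its buffer, so the cached buffer is invalid: discard it. */
            free(buf);
            buf = NULL;
        }
        if (buf == NULL) {
            buf = calloc(1, sizeof(*buf));  /* NULL check omitted here */
            buf->generation = atomic_load(&s_generation);
            pthread_setspecific(s_buf_key, buf);
        }
        return buf;
    }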
Platforms tested: RHEL5
Flag Day: no
Doc impact: no
Full diff: https://bugzilla.redhat.com/attachment.cgi?id=334019&action=diff