Re: [PATCH] prevent slapd from hanging under unlikely circumstances

Jay Fenlason <ds389@xxxxxxxxxxxxxxx> · Mon, 3 Feb 2020 10:57:07 -0500

Here's a backtrace of the one thread in the hung server that is blocked on
a lock.  All of the other threads are in pthread_cond_wait(), select() or
poll().

#0  0x00007fce1454f35e in pthread_rwlock_wrlock ()
    at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_rwlock_wrlock.S:85
#1  0x00007fce16e1acca in slapi_rwlock_wrlock (rwlock=<optimized out>)
    at ldap/servers/slapd/slapi2nspr.c:237
#2  0x00007fce06a74748 in wrap_rwlock_wrlock (rwlock=<optimized out>)
    at wrap.c:328
#3  0x00007fce06a7344c in plugin_wrlock () at map.c:1242
#4  0x00007fce06a5d639 in backend_be_pre_write_cb (pb=<optimized out>)
    at back-sch.c:2381
#5  0x00007fce16df8028 in plugin_call_func (list=0x55ddade18840, operation=operation@entry=453, pb=pb@entry=0x55ddaf08afc0, call_one=call_one@entry=0)
    at ldap/servers/slapd/plugin.c:2028
#6  0x00007fce16df82e3 in plugin_call_list (pb=0x55ddaf08afc0, operation=453, list=<optimized out>) at ldap/servers/slapd/plugin.c:1972
#7  0x00007fce16df82e3 in plugin_call_plugins (pb=pb@entry=0x55ddaf08afc0, whichfunction=whichfunction@entry=453) at ldap/servers/slapd/plugin.c:442
#8  0x00007fce08a31c88 in ldbm_back_delete (pb=0x55ddaf08afc0)
    at ldap/servers/slapd/back-ldbm/ldbm_delete.c:373
#9  0x00007fce16da8feb in op_shared_delete (pb=pb@entry=0x55ddaf08afc0)
    at ldap/servers/slapd/delete.c:324
#10 0x00007fce16da9383 in do_delete (pb=pb@entry=0x55ddaf08afc0)
    at ldap/servers/slapd/delete.c:97
#11 0x000055ddac64e62c in connection_dispatch_operation (pb=0x55ddaf08afc0, op=0x55ddadd17340, conn=0x55ddaf12ed00) at ldap/servers/slapd/connection.c:615
#12 0x000055ddac64e62c in connection_threadmain ()
    at ldap/servers/slapd/connection.c:1790
#13 0x00007fce14babc5b in _pt_root (arg=0x55ddaf0186c0)
    at ../../../nspr/pr/src/pthreads/ptthread.c:201
#14 0x00007fce1454be65 in start_thread (arg=0x7fcdf405d700)
    at pthread_create.c:307
#15 0x00007fce13bf788d in clone ()
    at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111



My thought when writing the patch was that calls to the pre-delete callback
and the post-delete callback should be balanced: either both of them should be
called or neither.  But you know the plugin ABI better than I do, and if
you think it's OK to have the post callback called without a corresponding pre,
who am I to argue?  I've just confirmed that this patch also prevents
the hangs I'm seeing:

--- a/ldap/servers/slapd/back-ldbm/ldbm_delete.c.orig	2020-01-31 07:28:04.085861174 -0500
+++ b/ldap/servers/slapd/back-ldbm/ldbm_delete.c	2020-01-31 07:30:33.932947489 -0500
@@ -81,6 +81,7 @@
     Connection *pb_conn;
     int32_t parent_op = 0;
     struct timespec parent_time;
+    int pre_delete_called = 0;
 
     if (slapi_pblock_get(pb, SLAPI_CONN_ID, &conn_id) < 0) {
         conn_id = 0; /* connection is NULL */
@@ -371,6 +372,7 @@
                 }
                 if (retval == 0) {
                     retval = plugin_call_plugins(pb, SLAPI_PLUGIN_BE_PRE_DELETE_FN);
+                    pre_delete_called = 1;
                 }
                 if (retval)
                 {
@@ -1491,7 +1493,7 @@
      * The bepostop is called even if the operation fails,
      * but not if the operation is purging tombstones.
      */
-    if (!delete_tombstone_entry) {
+    if (!delete_tombstone_entry || pre_delete_called) {
         plugin_call_plugins(pb, SLAPI_PLUGIN_BE_POST_DELETE_FN);
     }
     /* Need to return to cache after post op plugins are called */


Signed-off-by: Jay Fenlason <ds389@xxxxxxxxxxxxxxx>
_______________________________________________
389-devel mailing list -- 389-devel@xxxxxxxxxxxxxxxxxxxxxxx