Hi there, Would really like a second opinion here (especially looking at you Ludwig since you know this code well) about the behaviour of the filter test code. I have been following up on the filter optimisation patch to understand why in some conditions it causes IPA to fail. Examining the logs that Mark provided me (thank you!!!), I have checked a number of cases that looked like they could be suspect. Most of these were conditions where searches yielded nentries=0, but some conditions were true. An example of a query like this was also raised on the RH bugzilla. Here it is exploded to make it easier to see: ORIGINAL: (& (& (| (usercertificate;binary=) (ipaCertMapData=X509:<I>O=EXAMPLE.TEST,CN=Certificate Authority<S>O=EXAMPLE.TEST,CN=ipauser1) (altsecurityidentities=X509:<I>O=EXAMPLE.TEST,CN=Certificate Authority<S>O=EXAMPLE.TEST,CN=ipauser1) ) (objectClass=posixAccount) (uid=*) (& (uidNumber=*) (! (uidNumber=0) ) ) ) (objectClass=ipaIDObject) ) OPTIMISED: (| (objectclass=referral) (& (| (usercertificate;binary=) (ipaCertMapData=X509:<I>O=EXAMPLE.TEST,CN=Certificate Authority<S>O=EXAMPLE.TEST,CN=ipauser1) (altsecurityidentities=X509:<I>O=EXAMPLE.TEST,CN=Certificate Authority<S>O=EXAMPLE.TEST,CN=ipauser1) ) (objectClass=posixAccount) (uid=*) (uidNumber=*) (! (uidNumber=0) ) (objectClass=ipaIDObject) ) ) To me, observing this, these are identical - it appears the re-arrangement has worked to fold the nested & conditions, and the re-arrangement has not altered or lost components of the filter. As a result, I think that it is not the filter optimisation itself that is suspect, but the fact that the filter optimiser will cause the filter execution code to operate in a different order of operations. An important aspect of the optimisation is the we assume the filter test *must* be executed such that the remaining conditions of the filter are upheld. Observing the code I am wondering if there may be a logic error in the application of the filter test bypass flags. Note that in this case all line numbers are from branch: https://pagure.io/389-ds-base/pull-request/50252 All searches take place in ldbm_search.c, starting in ldbm_back_search on line 306. The actual candidate set is built on line 625 via build_candidate_list. On line ldbm_search.c:876 the filter bypass is evaluated by: if (li->li_filter_bypass && NULL != candidates && !virtual_list_view && !lookup_returned_allids) { } li_filter_bypass comes from ldbm_config.c:1614, and appears to default to 1 after inspection of dse.ldif (nsslapd-search-bypass-filter-test: on). candidates should be != NULL, and !vlv would be upheld. This means the condition now is on not-lookup_returned_allids. In build_candidate_list, lookup_returned_allids for a subtree search is set in subtree_candidates ldbm_search.c:1140. This is determined by checking that candidates is not NULL, and that candidates is equivalent to ALLIDS. Note that if the candidate set is partially evaluated here this would cause allids_before_scopingp (lookup_returned_allids) to be false. So at this point, all conditions on ldbm_search.c:876 should be met. Imagine we have a partial candidate set, below the filter test threshold. This would cause idl_intersection or idl_union etc to execute: slapi_be_set_flag(be, SLAPI_BE_FLAG_DONT_BYPASS_FILTERTEST); to indicate the partial nature of the candidate set. However I believe the issue may come from can_skip_filter_test in ldbm_search.c:1289. In this function the flag SLAPI_BE_FLAG_DONT_BYPASS_FILTERTEST is *NOT* evaluated. Only *other* conditions are asserted about the state of the candidate set and scoping. This means the flag SR_FLAG_CAN_SKIP_FILTER_TEST may be incorrectly applied to sr->sr_flags. If my memory serves correctly, this then would cause ldbm_back_next_search_entry_ext ldbm_search.c:1643 to now consider the candidate set members valid and the filter test is *not* run. It’s important to note, that the DONT_BYPASS flag is only referenced in these locations: ds I0> grep -r -n -e 'BYPASS_FILTER' ldap ldap/servers/slapd/back-ldbm/ldbm_search.c:164: * In case SLAPI_BE_FLAG_DONT_BYPASS_FILTERTEST is set, ldap/servers/slapd/back-ldbm/ldbm_search.c:167: slapi_be_unset_flag(be, SLAPI_BE_FLAG_DONT_BYPASS_FILTERTEST); ldap/servers/slapd/back-ldbm/ldbm_search.c:1087: *lookup_returned_allidsp = slapi_be_is_flag_set(be, SLAPI_BE_FLAG_DONT_BYPASS_FILTERTEST); ldap/servers/slapd/back-ldbm/idl_common.c:248: slapi_be_set_flag(be, SLAPI_BE_FLAG_DONT_BYPASS_FILTERTEST); ldap/servers/slapd/back-ldbm/idl_common.c:252: slapi_be_set_flag(be, SLAPI_BE_FLAG_DONT_BYPASS_FILTERTEST); ldap/servers/slapd/back-ldbm/idl_common.c:366: slapi_be_set_flag(be, SLAPI_BE_FLAG_DONT_BYPASS_FILTERTEST); ldap/servers/slapd/back-ldbm/idl_set.c:354: slapi_be_set_flag(be, SLAPI_BE_FLAG_DONT_BYPASS_FILTERTEST); ldap/servers/slapd/back-ldbm/idl_set.c:374: slapi_be_set_flag(be, SLAPI_BE_FLAG_DONT_BYPASS_FILTERTEST); ldap/servers/slapd/slapi-plugin.h:6455:#define SLAPI_BE_FLAG_DONT_BYPASS_FILTERTEST 0x10 /* force to call filter_test (search only) */ Importantly note, that the only location where we GET this flag is: ldap/servers/slapd/back-ldbm/ldbm_search.c:1087: *lookup_returned_allidsp = slapi_be_is_flag_set(be, SLAPI_BE_FLAG_DONT_BYPASS_FILTERTEST); Which is used as part of onelevel_candidates to set the lookup_returned_allidsp flag. This could cause a non-matching candidate set to be returned as the filter test was *NOT* run. The candidate set would be legitimately returned due to the filter test threshold optimisation. I am concerned this may be the case, because the filter above was (likely) intended for the following entry: dn: uid=ipauser1,cn=users,cn=accounts,dc=example,dc=test givenName: f sn: f uid: ipauser1 cn: f f displayName: f f initials: ff gecos: f f krbPrincipalName: ipauser1@xxxxxxxxxxxx objectClass: top objectClass: person objectClass: organizationalperson objectClass: inetorgperson objectClass: inetuser objectClass: posixaccount objectClass: krbprincipalaux objectClass: krbticketpolicyaux objectClass: ipaobject objectClass: ipasshuser objectClass: ipaSshGroupOfPubKeys objectClass: mepOriginEntry objectClass: ipacertmapobject loginShell: /bin/sh homeDirectory: /home/ipauser1 mail: ipauser1@xxxxxxxxxxxx krbCanonicalName: ipauser1@xxxxxxxxxxxx creatorsName: uid=admin,cn=users,cn=accounts,dc=example,dc=test modifiersName: uid=admin,cn=users,cn=accounts,dc=example,dc=test createTimestamp: 20190418161833Z modifyTimestamp: 20190418162000Z nsUniqueId: 88b5ac01-61f511e9-8106b510-ee60ba02 ipaUniqueID: 9cabc1f4-61f5-11e9-babc-5254000413f9 uidNumber: 64000001 gidNumber: 64000001 mepManagedEntry: cn=ipauser1,cn=groups,cn=accounts,dc=example,dc=test memberOf: cn=ipausers,cn=groups,cn=accounts,dc=example,dc=test ipaCertMapData: X509:<I>O=EXAMPLE.TEST,CN=Certificate Authority<S>O=EXAMPLE.TE ST,CN=ipauser1 Important to note here, the condition (objectClass=ipaIDObject) is NOT met on this object, yet I suspect IPA may expect this object to be returned on the filter above. There may be other possible errors here too. So I’d like my thoughts here to be checked and evaluated for sanity. I think that the fix is a (partial) patch as follows: @@ -1288,12 +1288,18 @@ grok_filter(struct slapi_filter *f) static int can_skip_filter_test( Slapi_PBlock *pb __attribute__((unused)), + backend *be, struct slapi_filter *f, int scope, IDList *idl) { int rc = 0; + /* Did the search operation tell us NOT to skip the filter test? */ + if (slapi_be_is_flag_set(be, SLAPI_BE_FLAG_DONT_BYPASS_FILTERTEST)) { + return rc; + } + /* Is the ID list ALLIDS ? */ if (ALLIDS(idl)) { /* If so, then can't optimize */ An alternate patch could be to check the DONT_BYPASS flag inside of subtree_candidates to set the allids_before_scopingp (lookup_returned_allids) flags, which may or may not be more consistent with other server behaviours. Testing this as a patch with filter optimisation disabled but bypass set passes basic + filter suites. Enabling filter optimisation and the bypass patch also passes both basic and filter. Another method to verify this, would be to set nsslapd-search-bypass-filter-test: verify in dse.ldif and see if there are verification errors in the output. CONCLUSION: It may be that filter optimisation is highlighting incorrect ipa queries by executing their components in a different yet valid order, rather than actually causing corruption or other issues. It could be that we have allowed such queries to evaluate as true due to a mistake in the application of the filter test in our search code. — Sincerely, William Brown Senior Software Engineer, 389 Directory Server SUSE Labs _______________________________________________ 389-devel mailing list -- 389-devel@xxxxxxxxxxxxxxxxxxxxxxx To unsubscribe send an email to 389-devel-leave@xxxxxxxxxxxxxxxxxxxxxxx Fedora Code of Conduct: https://getfedora.org/code-of-conduct.html List Guidelines: https://fedoraproject.org/wiki/Mailing_list_guidelines List Archives: https://lists.fedoraproject.org/archives/list/389-devel@xxxxxxxxxxxxxxxxxxxxxxx