Re: Deref plugin entries == NULL #4525

> On 14 Jan 2021, at 21:32, Pierre Rogier <progier@xxxxxxxxxx> wrote:
> 
> Hi William, 
> 
> > It's a scenario we will need to fix via your BE work because of the MVCC transaction model that 
> > LMDB will force us to adopt :)
>   
> As I see things, in the early phases the lmdb read txn will probably be managed only at the db plugin level rather than at the backend level. That means we will have the same inconsistency risk as today (i.e. as if using bdb and the implicit txn).
> The txn model redesign you are speaking about should only occur in one of the last phases (once bdb no longer coexists with lmdb).
> It must be done because it could provide a serious performance boost for read operations (IMHO, in most cases we could avoid duplicating the db data).
> But we should not do it while bdb is still around because of the risk of lock issues and excessive retries.

Yep, agreed. It will be needed not just for a large read performance boost, but also to prevent exactly this kind of issue. We should be able to move to a model where everything is always within a transaction.

We could introduce it earlier, have the read txns be a no-op for bdb, and continue using the implied transactions that we currently have, but then perhaps there is no benefit to doing this earlier :)
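
To make the "no-op for bdb" idea concrete, here is a minimal sketch of what such an interface could look like. Every name in it (toy_read_txn, toy_backend_ops, toy_with_read_txn and so on) is hypothetical and invented for illustration; none of it is the actual 389-ds backend API, and the lmdb side is only a stand-in for opening a real MVCC read txn.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical types and names for illustration; not the 389-ds API. */
typedef struct toy_read_txn {
    int active;   /* 1 between begin and end */
    int snapshot; /* 1 if all reads inside the txn share one snapshot */
} toy_read_txn;

typedef struct toy_backend_ops {
    const char *name;
    int (*read_txn_begin)(toy_read_txn *txn);
    void (*read_txn_end)(toy_read_txn *txn);
} toy_backend_ops;

/* bdb flavour: a no-op, each read keeps using its own implied txn. */
int toy_bdb_read_txn_begin(toy_read_txn *txn)
{
    txn->active = 1;
    txn->snapshot = 0;
    return 0;
}

/* lmdb flavour: a real implementation would open an MVCC read txn here. */
int toy_lmdb_read_txn_begin(toy_read_txn *txn)
{
    txn->active = 1;
    txn->snapshot = 1;
    return 0;
}

void toy_read_txn_end(toy_read_txn *txn)
{
    txn->active = 0;
    txn->snapshot = 0;
}

const toy_backend_ops toy_bdb_ops = {"bdb", toy_bdb_read_txn_begin, toy_read_txn_end};
const toy_backend_ops toy_lmdb_ops = {"lmdb", toy_lmdb_read_txn_begin, toy_read_txn_end};

typedef int (*toy_read_op)(const toy_read_txn *txn, void *arg);

/* Run one read operation inside a read txn, whatever the backend. */
int toy_with_read_txn(const toy_backend_ops *be, toy_read_op op, void *arg)
{
    toy_read_txn txn = {0, 0};
    int rc;

    assert(be != NULL && op != NULL);
    rc = be->read_txn_begin(&txn);
    if (rc != 0) {
        return rc;
    }
    rc = op(&txn, arg);
    be->read_txn_end(&txn);
    return rc;
}

/* Example operation: records whether it ran under a shared snapshot. */
int toy_report_snapshot(const toy_read_txn *txn, void *arg)
{
    *(int *)arg = txn->snapshot;
    return 0;
}
```

With this shape the callers would look the same for both backends, but only the lmdb flavour would actually give them a consistent snapshot, which is why introducing it early may bring little benefit.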

> 
> Note I put a phasing section in
> https://directory.fedoraproject.org/docs/389ds/design/backend-redesign-phase3.html#phasing
> explaining that. But I guess I should move it into Ludwig's document, which encompasses it.
> 
> Pierre
> 
> On Thu, Jan 14, 2021 at 12:01 AM William Brown <wbrown@xxxxxxx> wrote:
> 
> 
> > On 13 Jan 2021, at 21:24, Pierre Rogier <progier@xxxxxxxxxx> wrote:
> > 
> > Thank you William,
> > So far your scenario (entry found when reading the base entry but no longer existing when computing the candidates) is the only one that matches the symptoms.
> 
> It's a scenario we will need to fix via your BE work because of the MVCC transaction model that LMDB will force us to adopt :) 
> 
> > And that triggered a thought: 
> >  We cannot do anything for SUBTREE and ONE_LEVEL searches
> >   because the fact that the base entry id is not in the candidate list may be normal,
> >  but IMHO we should improve the BASE search case.
> > In this case the candidate list is directly set to the base entry id
> >  ==> if the candidate entry (in ldbm_back_next_search_entry) is not found and the scope is BASE then we should return an LDAP_NO_SUCH_OBJECT error.
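
The BASE-scope rule proposed above can be sketched as a toy model. This is not the code of ldbm_back_next_search_entry: the store, the search state and all toy_ names are assumptions made up for the example. Only the numeric values are standard LDAP (scope baseObject is 0, result code noSuchObject is 32).

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical toy model; only the numeric values are standard LDAP. */
enum { TOY_SCOPE_BASE = 0, TOY_SCOPE_ONELEVEL = 1, TOY_SCOPE_SUBTREE = 2 };
enum { TOY_SUCCESS = 0, TOY_NO_SUCH_OBJECT = 32 };

typedef struct toy_store {
    const unsigned *ids; /* ids of entries that currently exist */
    size_t n;
} toy_store;

typedef struct toy_search {
    int scope;
    const unsigned *candidates; /* candidate entry ids */
    size_t ncand;
    size_t pos;                 /* next candidate to examine */
} toy_search;

int toy_store_has(const toy_store *db, unsigned id)
{
    size_t i;

    for (i = 0; i < db->n; i++) {
        if (db->ids[i] == id) {
            return 1;
        }
    }
    return 0;
}

/*
 * Returns TOY_SUCCESS with *out_id set to the next live candidate,
 * or with *out_id == 0 once the candidate list is exhausted.
 * A missing candidate is skipped for ONE_LEVEL and SUBTREE, but is
 * reported as TOY_NO_SUCH_OBJECT for BASE, where the only candidate
 * is the base entry itself.
 */
int toy_next_search_entry(toy_search *st, const toy_store *db, unsigned *out_id)
{
    assert(st != NULL && db != NULL && out_id != NULL);
    while (st->pos < st->ncand) {
        unsigned id = st->candidates[st->pos++];

        if (toy_store_has(db, id)) {
            *out_id = id;
            return TOY_SUCCESS;
        }
        if (st->scope == TOY_SCOPE_BASE) {
            *out_id = 0;
            return TOY_NO_SUCH_OBJECT;
        }
        /* Other scopes: a vanished candidate may be normal, keep going. */
    }
    *out_id = 0;
    return TOY_SUCCESS;
}
```

The design choice is the asymmetry: only the BASE case turns a vanished candidate into an error, since for the other scopes the caller cannot tell a race from an ordinary non-match.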
> 
> I suspect that Mark has seen this email and submitted a PR to resolve this exact case :) 
> 
> 
> > 
> >        Pierre
> > 
> > 
> > On Wed, Jan 13, 2021 at 1:45 AM William Brown <wbrown@xxxxxxx> wrote:
> > Hey there,
> > 
> > https://github.com/389ds/389-ds-base/pull/4525/files
> > 
> > I had a look and I can see a few possible contributing factors, but without a core and the exact state I can't be sure if this is correct. It's all just hypothetical from reading the code.
> > 
> > 
> > The crash is in deref_do_deref_attr() which is called as part of deref_pre_entry(). This is the SLAPI_PLUGIN_PRE_ENTRY_FN which is called by "./ldap/servers/slapd/result.c:1488:    rc = plugin_call_plugins(pb, SLAPI_PLUGIN_PRE_ENTRY_FN);"
> > 
> > 
> > I think what's important here is that the search conducted in ./ldap/servers/slapd/opshared.c:818  rc = (*be->be_search)(pb);  is *not* in a transaction. That means that while the single search in be_search() is consistent due to an implied transaction, the subsequent search in deref_pre_entry() is likely conducted in a separate transaction. This allows other operations to interleave and make changes: a modrdn or delete would certainly be a candidate for removing a DN between these two points. It would be extremely hard to reproduce, as with any race condition, of course.
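
The interleaving described above can be illustrated with a small single-threaded model. It is only a sketch under stated assumptions: toy_entry, toy_lookup and toy_deref_target are hypothetical names, each lookup stands in for one read under its own implied transaction, and the "concurrent" delete is simulated by flipping a flag between the two reads.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical toy store; not the 389-ds entry or plugin API. */
typedef struct toy_entry {
    const char *dn;
    int deleted; /* set by a simulated concurrent delete or modrdn */
} toy_entry;

/* Each call stands for one read under its own implied transaction. */
const toy_entry *toy_lookup(const toy_entry *store, size_t n, const char *dn)
{
    size_t i;

    assert(dn != NULL);
    for (i = 0; i < n; i++) {
        if (!store[i].deleted && strcmp(store[i].dn, dn) == 0) {
            return &store[i];
        }
    }
    return NULL;
}

/*
 * Second read, as a pre-entry plugin might perform it.
 * Returns 1 if the target was found, 0 if it vanished since the
 * first read. The NULL check is the defensive part of the sketch.
 */
int toy_deref_target(const toy_entry *store, size_t n, const char *dn)
{
    const toy_entry *target = toy_lookup(store, n, dn);

    if (target == NULL) {
        return 0;
    }
    return 1;
}
```

If the first read finds the entry and a delete lands before toy_deref_target runs, the second read returns nothing. Without a shared transaction, the second reader has to tolerate that outcome.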
> > 
> > 
> > A question you asked is why we don't get a "no such entry" error or similar. I think this is because build_candidate_list in ldbm_search.c doesn't actually raise an error if the base_candidates list is empty, because an IDL is allocated with a value of 0 (no matching entries). This allows the search to proceed without errors, and the result set is set to NULL with size 0. I can't see where LDAP_NO_SUCH_OBJECT is set in this process, but without looking further into it, my suspicion is that entries of size 0 won't return an error condition to internal_search_pb, so it's valid for this to be empty.
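
The "empty IDL is not an error" behaviour suspected above can be modelled as follows. This is a sketch of the pattern only, not of build_candidate_list itself: toy_idl, toy_build_candidates and toy_result_is_usable are hypothetical names, and the real IDL structure differs.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical, simplified ID list; the real IDL type is different. */
typedef struct toy_idl {
    size_t count;
    unsigned ids[4];
} toy_idl;

/*
 * Builds the candidate list for a base search. Returns 0 on success,
 * including when the base entry was not found: the list is then
 * allocated but empty, and no error is raised.
 */
int toy_build_candidates(int base_found, unsigned base_id, toy_idl **out)
{
    toy_idl *idl;

    assert(out != NULL);
    idl = calloc(1, sizeof(*idl));
    if (idl == NULL) {
        return -1;
    }
    if (base_found) {
        idl->ids[0] = base_id;
        idl->count = 1;
    }
    *out = idl;
    return 0;
}

/* Caller-side check: success alone does not mean there is an entry. */
int toy_result_is_usable(int rc, const toy_idl *idl)
{
    return rc == 0 && idl != NULL && idl->count > 0;
}
```

A caller that tests only the return code would treat the empty case as a normal success, which matches the suspicion that a zero-size result reaches the plugin without any error condition.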
> > 
> > Anyway, again, this is just from reading the code for 20 minutes and is not a complete in-depth investigation, but maybe it gives some ideas about what happened?
> > 
> > Hope it helps :) 
> > 
> > 
> > 
> > —
> > Sincerely,
> > 
> > William Brown
> > 
> > Senior Software Engineer, 389 Directory Server
> > SUSE Labs, Australia
> > _______________________________________________
> > 389-devel mailing list -- 389-devel@xxxxxxxxxxxxxxxxxxxxxxx
> > To unsubscribe send an email to 389-devel-leave@xxxxxxxxxxxxxxxxxxxxxxx
> > Fedora Code of Conduct: https://docs.fedoraproject.org/en-US/project/code-of-conduct/
> > List Guidelines: https://fedoraproject.org/wiki/Mailing_list_guidelines
> > List Archives: https://lists.fedoraproject.org/archives/list/389-devel@xxxxxxxxxxxxxxxxxxxxxxx
> > 
> > 
> > -- 
> > --
> > 
> > 389 Directory Server Development Team
> 
> —
> Sincerely,
> 
> William Brown
> 
> Senior Software Engineer, 389 Directory Server
> SUSE Labs, Australia
> 
> 
> -- 
> --
> 
> 389 Directory Server Development Team

—
Sincerely,

William Brown

Senior Software Engineer, 389 Directory Server
SUSE Labs, Australia



