Re: LMDB vs BDB where locks are exhausted

Ludwig Krispenz <krispenz@xxxxxxxxxxx> · Tue, 23 Jun 2020 20:43:36 +0200

On 23.06.20 19:19, Mark Reynolds wrote:

On 6/23/20 12:22 PM, David Boreham wrote:

On 6/23/2020 10:07 AM, Mark Reynolds wrote:

In 389 what we are seeing is that our backend txn plugins are doing 
unindexed searches, but I would not call it a bug.

The unindexed search is fine per se (although probably not a great 
idea if you want the op the plugin hooked to complete quickly).

What's not fine is that all the DB reads under that search should be 
done in the same transaction with strong isolation.

First, I'm not that intimately familiar with this issue; Thierry and 
Ludwig did most of that investigation.  But this happens during a 
modify operation that triggers some BE txn plugins that do searches 
and updates under the same parent transaction.  It is under these 
conditions that it starts consuming a ton of db locks.
If a transactional operation (e.g. modify) triggers a search by a plugin, 
it already holds a couple of page locks as write locks. If the search 
tried to access these pages without using the txn of the parent, it 
would have to wait - and the whole operation would self-deadlock. So all 
db accesses inside a txn need to use this txn directly or as a parent txn.
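
To make the self-deadlock concrete, here is a toy lock table in Python. 
It is only a sketch: ToyLockTable, the txn names, and the rule that a 
child txn may read pages write-locked by its ancestors are illustrative 
assumptions based on the description above, not libdb's API or the 389 
code.

```python
class ToyLockTable:
    """Toy page-lock table; names and rules are illustrative, not libdb's."""

    def __init__(self):
        self.write_owner = {}   # page -> txn holding the write lock
        self.parent = {}        # child txn -> parent txn (None for top level)

    def begin(self, txn, parent=None):
        self.parent[txn] = parent

    def _family(self, txn):
        # The txn itself plus all of its ancestors.
        chain = set()
        while txn is not None:
            chain.add(txn)
            txn = self.parent.get(txn)
        return chain

    def write_lock(self, txn, page):
        # Toy simplification: no conflict check on the write itself.
        self.write_owner[page] = txn

    def can_read(self, txn, page):
        # Granted if nobody holds a write lock, or the holder is this txn
        # or one of its ancestors. Any other reader would have to wait.
        owner = self.write_owner.get(page)
        return owner is None or owner in self._family(txn)


locks = ToyLockTable()
locks.begin("mod_txn")
locks.write_lock("mod_txn", page=7)            # the modify dirtied page 7

locks.begin("plugin_child", parent="mod_txn")  # search nested under the mod
locks.begin("plugin_unrelated")                # search run outside the mod's txn

same_txn_ok = locks.can_read("mod_txn", 7)
child_ok = locks.can_read("plugin_child", 7)
unrelated_ok = locks.can_read("plugin_unrelated", 7)
print(same_txn_ok, child_ok, unrelated_ok)
```

In this model the unrelated reader would have to wait for mod_txn to 
release page 7, while mod_txn is itself waiting for the plugin to 
return, so neither can make progress.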

Unindexed searches by themselves do not cause this issue; it's when we 
are updating the database under the same txn.  So the mod takes a lock 
on a db page, then we call the be postop plugins, which in turn start 
doing these expensive searches and updates - that is when the db lock 
issue pops up.  I seem to recall from previous similar cases that this 
"mod update" involved a very large static group, and the RI or 
memberOf plugin doing its work. Maybe Thierry recalls some of the past 
cases?

It's really a configuration/indexing issue.  But yes, there are long 
running operations/txns, with many plugins doing a lot of 
things while the database is being updated in the same nested 
operation.  Now when these internal searches are properly indexed, 
the db lock issue completely goes away.
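
A rough way to picture why indexing makes the problem disappear is to 
count the peak number of page locks held in each case. The Python below 
is a toy model, not a measurement: it assumes that read locks taken 
inside a txn are kept until the txn ends, and that a search outside a 
txn only keeps the current page locked. MAX_LOCKS and the page counts 
are made-up numbers for illustration.

```python
def peak_locks(pages_touched, in_txn):
    """Peak number of page read locks held during a scan (toy model).

    Assumption: inside a transaction every read lock is kept until the
    transaction ends; outside one, the previous page's lock is released
    as the cursor moves on, so only one page is locked at a time.
    """
    held = 0
    peak = 0
    for _ in range(pages_touched):
        if not in_txn:
            held = 0          # previous page's lock released on cursor move
        held += 1             # lock the page being read
        peak = max(peak, held)
    return peak


MAX_LOCKS = 10_000            # hypothetical lock-table size
ALL_PAGES = 50_000            # hypothetical unindexed scan: every page
INDEXED_PAGES = 12            # hypothetical indexed lookup: a few pages

standalone = peak_locks(ALL_PAGES, in_txn=False)
nested_unindexed = peak_locks(ALL_PAGES, in_txn=True)
nested_indexed = peak_locks(INDEXED_PAGES, in_txn=True)

for name, n in [("standalone unindexed", standalone),
                ("unindexed under mod txn", nested_unindexed),
                ("indexed under mod txn", nested_indexed)]:
    print(f"{name}: peak {n} locks, exhausted={n > MAX_LOCKS}")
```

Under these assumptions only the unindexed search nested in the mod's 
txn exceeds the lock table, which matches the pattern described above.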

If missing an index were to result in poor performance, agreed -- 
it's a configuration issue. The server process exiting seems quite an 
extreme consequence.
It's not exactly crashing, but the db can get corrupted and it needs 
to be reinitialized.  That sounds like a libdb bug to me :-) Running 
out of db locks should not corrupt the database.

Wondering if this is the result of an old fix for a deadlock problem 
(bringing the internal op under the main transaction to cure the 
deadlock)?
Maybe :-)  Haven't looked at that code in quite a few years...

How is a regular (non-internal) unindexed search run? Surely that 
doesn't burn through one lock per page touched?

No it doesn't.  See my comment above, standalone unindexed searches do 
not trigger this issue.

Mark

_______________________________________________
389-users mailing list -- 389-users@xxxxxxxxxxxxxxxxxxxxxxx
To unsubscribe send an email to 389-users-leave@xxxxxxxxxxxxxxxxxxxxxxx
Fedora Code of Conduct: 
https://docs.fedoraproject.org/en-US/project/code-of-conduct/
List Guidelines: https://fedoraproject.org/wiki/Mailing_list_guidelines
List Archives: 
https://lists.fedoraproject.org/archives/list/389-users@xxxxxxxxxxxxxxxxxxxxxxx
