On 09/13/2013 02:39 PM, David Boreham wrote:
On 9/13/2013 2:18 PM, Rich Megginson wrote:
On 09/12/2013 07:08 PM, David Boreham wrote:
On 9/11/2013 11:41 AM, Howard Chu wrote:
Just out of curiosity, why is keeping a count per key a problem? If
you're using BDB duplicate key support, can't you just use
cursor->c_count() to get this? I.e., BDB already maintains key
counts internally, why not leverage that?
afaik you need to pass the DB_RECNUM flag at DB creation time to get
record counting behavior, and it imposes a performance and
concurrency penalty on writes. Also afaik 389DS does not set that
flag except on VLV indexes (which need it, and coincidentally were
the original reason for the feature being added to BDB).
I'm using bdb 4.7 on RHEL 6.
Looking at the code, it appears the dbc->count method for btree is
__bamc_count() in bt_cursor.c. I'm not sure, but it looks as though
this function has to iterate each page counting the duplicates on
each page, which makes it a non-starter. Unless I'm mistaken, it
doesn't look as though it keeps a counter on each update, then simply
returns the counter. I don't see any code which would make the
behavior different depending on if DB_RECNUM is used when the
database is created.
The DB_RECNUM count feature is not accessed via dbc->count() but
through the dbc->c_get() call, passing DB_GET_RECNO, positioning at
the last key. You do also need to use nested btrees for it to count
the dups, afaik (but we're doing that in the DS indexes already I
believe).
I wrote a small bdbtest.py script which uses the python bdb interface.
https://github.com/richm/scripts/blob/master/bdbtest.py
This creates an env, opens a db with
bsddb.db.DB_DUPSORT|bsddb.db.DB_RECNUM, adds several non-dup and dup
records, opens a cursor and iterates them. This is the output:
open dbenv in /var/tmp/dbtest
open db /var/tmp/dbtest/dbtest.db4
no txn records
key=key0 val=data0
extra=('', '\x01\x00\x00\x00')
<snip>
key=key9 val=data9
extra=('', '\n\x00\x00\x00')
key=multikey val=multidata0
extra=('', '\x0b\x00\x00\x00')
<snip>
key=multikey val=multidata9
extra=('', '\x0b\x00\x00\x00')
The extra is the str() output of cur.get(bsddb.db.DB_GET_RECNO).
So for all of the dup records, the recno is the same: '\x0b' == 11?
I'm probably missing something, but how do I use this to get the number
of duplicates?
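
For reference, the raw extra values above can be turned into integers. This is a minimal sketch, assuming the recno is a 32-bit unsigned integer stored little-endian (as it appears to be on this x86 RHEL 6 box); decode_recno is a hypothetical helper, not part of bsddb, and the byte strings are copied from the output above:

```python
import struct


def decode_recno(raw):
    """Decode a 4-byte record number.

    Assumption: 32-bit unsigned, little-endian, which matches the
    byte strings printed by the test script on x86.
    """
    (recno,) = struct.unpack("<I", raw)
    return recno


# Values taken from the script output above (bytes literals for Python 3).
samples = [
    ("key0", b"\x01\x00\x00\x00"),
    ("key9", b"\n\x00\x00\x00"),
    ("multikey", b"\x0b\x00\x00\x00"),
]

for key, raw in samples:
    print(key, decode_recno(raw))
```

Under that assumption key0 decodes to 1, key9 to 10 and multikey to 11, so in this test the recno seems to track the position of the key rather than the individual duplicates.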
--
389-devel mailing list
389-devel@xxxxxxxxxxxxxxxxxxxxxxx
https://admin.fedoraproject.org/mailman/listinfo/389-devel