On 1/18/21 5:04 PM, Jan Tomasek wrote:
Hi Thierry,
On 15. 01. 21 11:06, thierry bordaz wrote:
Would you be able to run those commands:
dbscan -f /var/lib/dirsrv/<instance>/db/cesnet_cz/nsuniqueid.db -k
=fffffff-fffffff-fffffff-fffffff -r =fffffff-fffffff-fffffff-fffffff
This segfaults:
root@cml3:~# dbscan -f
/var/lib/dirsrv/slapd-cml3/db/test/nsuniqueid.db -k
=fffffff-fffffff-fffffff-fffffff -r =fffffff-fffffff-fffffff-fffffff
Can't find key '=fffffff-fffffff-fffffff-fffffff'
Segmentation fault
strace:
openat(AT_FDCWD, "/var/lib/dirsrv/slapd-cml3/db/test/nsuniqueid.db",
O_RDONLY) = 3
fcntl(3, F_GETFD) = 0
fcntl(3, F_SETFD, FD_CLOEXEC) = 0
fstat(3, {st_mode=S_IFREG|0600, st_size=16384, ...}) = 0
mmap(NULL, 16384, PROT_READ, MAP_PRIVATE, 3, 0) = 0x7f51149b3000
fstat(1, {st_mode=S_IFCHR|0600, st_rdev=makedev(0x88, 0x1), ...}) = 0
write(1, "Can't find key '=fffffff-fffffff"..., 50Can't find key
'=fffffff-fffffff-fffffff-fffffff'
) = 50
--- SIGSEGV {si_signo=SIGSEGV, si_code=SEGV_MAPERR,
si_addr=0x7fff3c000000} ---
+++ killed by SIGSEGV +++
Segmentation fault
My fault, the key (-k) was missing some 'f' characters. It should be:
dbscan -f /var/lib/dirsrv/slapd-cml3/db/test/nsuniqueid.db -k
=ffffffff-ffffffff-ffffffff-ffffffff -r
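The difference between the two keys is easy to miss by eye: the failing one has seven 'f' characters per group, the corrected one has eight. Below is a minimal sketch for checking a key before passing it to dbscan -k. It assumes the nsuniqueid index key is '=' followed by four hyphen-separated groups of eight hex digits, which matches every key listed in this thread; the function name is made up for illustration.

```python
import re

# Assumed key shape, inferred from the keys shown in this thread:
# '=' followed by four groups of eight hex digits, separated by '-'.
_KEY_RE = re.compile(r"=[0-9a-fA-F]{8}(-[0-9a-fA-F]{8}){3}")


def is_valid_nsuniqueid_key(key: str) -> bool:
    """Return True if key looks like an nsuniqueid equality index key."""
    return _KEY_RE.fullmatch(key) is not None


if __name__ == "__main__":
    for key in ("=fffffff-fffffff-fffffff-fffffff",
                "=ffffffff-ffffffff-ffffffff-ffffffff"):
        print(key, is_valid_nsuniqueid_key(key))
```

This only checks the shape of the key. It says nothing about whether the key exists in the index.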
I've created a simple test suffix (see ldif) and the problem persists :(
Error is now:
[18/Jan/2021:15:36:07.639103043 +0100] - ERR - _entryrdn_insert_key -
Same DN (dn: nsuniqueid=ffffffff-ffffffff-ffffffff-ffffffff,dc=test)
is already in the entryrdn file with different ID 4. Expected ID is 6.
[18/Jan/2021:15:36:07.639405490 +0100] - ERR - index_addordel_entry -
database index operation failed BAD 1023, err=9999 Unknown error 9999
[18/Jan/2021:15:36:07.794625784 +0100] - ERR - NSMMReplicationPlugin -
_replica_configure_ruv - Failed to create replica ruv tombstone entry
(dc=test); LDAP error - 1
[18/Jan/2021:15:36:07.794954251 +0100] - ERR - NSMMReplicationPlugin -
replica_new - Unable to configure replica dc=test:
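The _entryrdn_insert_key message carries the DN and the two conflicting entry IDs (4 and 6 here), which can then be looked up in id2entry.db with dbscan -K. A rough sketch for extracting them from a log excerpt follows. It assumes the message wording is exactly as quoted above; the function and type names are hypothetical.

```python
import re
from typing import NamedTuple, Optional


class RdnConflict(NamedTuple):
    dn: str
    found_id: int      # ID already stored in the entryrdn index
    expected_id: int   # ID the server expected for this DN


# Pattern built from the error text quoted in this thread (an assumption:
# other server versions may word the message differently).
_CONFLICT_RE = re.compile(
    r"Same DN \(dn: (?P<dn>.+?)\) is already in the entryrdn file "
    r"with different ID (?P<found>\d+)\. Expected ID is (?P<expected>\d+)\."
)


def parse_entryrdn_conflict(log_text: str) -> Optional[RdnConflict]:
    """Return the first entryrdn ID conflict found in log_text, or None."""
    # Mail clients wrap long log lines, so collapse all whitespace first.
    flat = " ".join(log_text.split())
    match = _CONFLICT_RE.search(flat)
    if match is None:
        return None
    return RdnConflict(match["dn"], int(match["found"]), int(match["expected"]))


if __name__ == "__main__":
    sample = """[18/Jan/2021:15:36:07.639103043 +0100] - ERR - _entryrdn_insert_key -
    Same DN (dn: nsuniqueid=ffffffff-ffffffff-ffffffff-ffffffff,dc=test)
    is already in the entryrdn file with different ID 4. Expected ID is 6."""
    print(parse_entryrdn_conflict(sample))
```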
I tried that (on the master branch) but could not reproduce this failure
during reindex.
root@cml3:~# dbscan -f /var/lib/dirsrv/slapd-cml3/db/test/nsuniqueid.db
=d5658282-599911eb-af359663-f13d537d
=d5658283-599911eb-af359663-f13d537d
=d5658284-599911eb-af359663-f13d537d
=d5658285-599911eb-af359663-f13d537d
root@cml3:~# dbscan -f /var/lib/dirsrv/slapd-cml3/db/test/id2entry.db
-K 4
id 4
rdn: nsuniqueid=ffffffff-ffffffff-ffffffff-ffffffff
objectClass: top
objectClass: nsTombstone
objectClass: extensibleobject
nsUniqueId: ffffffff-ffffffff-ffffffff-ffffffff
nsds50ruv: {replicageneration} 60059bd3000000010000
nsds50ruv: {replica 1 ldap://cml3.cesnet.cz:389} 60059bdd000200010000 60059c66000000010000
dc: test
nscpEntryDN: dc=test
nsruvReplicaLastModified: {replica 1 ldap://cml3.cesnet.cz:389} 60059c66
nsds5agmtmaxcsn: dc=test;test-ldap31;ldap31.cesnet.cz;636;65535;60059c66000000010000
nsds5agmtmaxcsn: dc=test;test-ldap32;ldap32.cesnet.cz;636;65535;60059c66000000010000
The entry (RUV) 'nsuniqueid=ffffffff-ffffffff-ffffffff-ffffffff,dc=test'
may take some time to appear: the time it takes the replica to flush the
in-memory RUV to a DB entry.
root@cml3:~# dbscan -f /var/lib/dirsrv/slapd-cml3/db/test/id2entry.db
-K 6
Can't set cursor to returned item: BDB0073 DB_NOTFOUND: No matching
key/data pair found
free(): invalid pointer
Aborted
After I run a reindex on the backend:
root@cml3:~# dsctl cml3 db2index test
the ffffffff... entry shows up in nsuniqueid.db
root@cml3:~# dbscan -f /var/lib/dirsrv/slapd-cml3/db/test/nsuniqueid.db
=d5658282-599911eb-af359663-f13d537d
=d5658283-599911eb-af359663-f13d537d
=d5658284-599911eb-af359663-f13d537d
=d5658285-599911eb-af359663-f13d537d
=ffffffff-ffffffff-ffffffff-ffffffff
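To compare two such listings without reading them line by line, the key sets can be diffed. This is a small sketch, assuming the dbscan output has been captured as text and that index keys are the lines starting with '=' as in the listings in this thread; the helper names are hypothetical.

```python
def index_keys(dbscan_output: str) -> set:
    """Collect the index keys (lines starting with '=') from a dbscan listing."""
    return {
        line.strip()
        for line in dbscan_output.splitlines()
        if line.strip().startswith("=")
    }


def new_keys(before: str, after: str) -> set:
    """Return keys present in the 'after' listing but not in 'before'."""
    return index_keys(after) - index_keys(before)


if __name__ == "__main__":
    before = """=d5658282-599911eb-af359663-f13d537d
=d5658283-599911eb-af359663-f13d537d
=d5658284-599911eb-af359663-f13d537d
=d5658285-599911eb-af359663-f13d537d"""
    after = before + "\n=ffffffff-ffffffff-ffffffff-ffffffff"
    print(new_keys(before, after))
```

With the two listings from this thread, the only new key is the RUV tombstone key.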
At this step, db2index and restart did not generate the
'_entryrdn_insert_key' error message.
Now the server is able to start. Both replicas need reinitialization,
and after reinitialization it works. Until the next complete reindex. ;)
I've tested once again with a fresh db. The record rdn:
nsuniqueid=ffffffff-ffffffff-ffffffff-ffffffff appears in
nsuniqueid.db after reinitialization of both replicas is completed.
Yes, there is a small delay before it appears.
Isn't my problem related to this:
https://github.com/389ds/389-ds-base/issues/273 ?
My system is Debian Buster and 389 DS is in version 1.4.4.9 taken from
Debian Bullseye. If I can provide some more debug info please let me
know.
Having applied the same steps without hitting that bug, I think 1.4.4.9 is
likely missing some fixes compared with the master branch.
I do not recall any recent (1.4.x) problem around reindex.
regards
thierry
I hope I can operate the servers like this without doing a reindex on all
attributes, but it would be nice if this were fixed.
Thanks
_______________________________________________
389-users mailing list -- 389-users@xxxxxxxxxxxxxxxxxxxxxxx