> > I have a question / concern though. I thought that we want dbscan 2
> > ldif for emergency recovery scenarios when all else has gone bad and
> > assuming that id2entry is still readable. In the approach you
> > described we make the assumption that the parentid index is readable
> > as well. So we depend on two files instead of one for exporting the
> > database. Does this matter or we don't care at all?
>
> There are two scenarios here in my opinion. Backup, and emergency
> backup :-) As I've previously stated: performance is important. It
> should not take forever to process a 100 million entry database. I
> think the tool should use multiple index files (id2entry + friends) if
> we can generate the LDIF faster. But, if some of those indexes are
> corrupted, then we need an alternate algorithm to generate it just from
> id2entry. Also, if we are dealing with a corrupted db, then performance
> is not important, recovery is. So if we can do it fast, do it,
> otherwise grind it out.
>
> All that being said there is something we need to consider, which I
> don't have an answer for, and that is when databases do get corrupted
> which files typically get corrupted? Is it indexes, or is it id2entry?
> To be honest database corruption doesn't happen very often, but the tool
> should be smart enough to realize that the data could be inaccurate.
> Perhaps a parent could be missing, etc. So the tool should be robust
> enough to use multiple techniques to complete an entry, and if it can't
> it should log something, or better yet create a rejects file that an
> Admin can take and repair manually.
>
> I know this is getting more complicated, but we need to keep these
> things in mind.
>
> Regards,
> Mark

With the current design of id2entry and friends, we can't automatically
detect this so easily. I think we should really just have a flag on
dbscan that says "ignore everything BUT id2entry" and recover all you
can. We should leave this to a human to make that call.
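To make the "id2entry only" idea concrete, here is a rough sketch in Python rather than in dbscan's own code. It assumes each record decoded from id2entry gives us an entry ID, a parent ID (None for a suffix entry) and a DN. The function name and the tuple layout are hypothetical and are not an existing dbscan interface. Entries reachable from a suffix are emitted parents first, and anything whose parent chain is missing goes to a rejects list for an admin to repair by hand.

```python
from collections import defaultdict, deque


def export_id2entry_only(entries):
    """Order entries parents-first using only data taken from id2entry.

    entries: iterable of (entry_id, parent_id, dn) tuples (hypothetical
    layout); parent_id is None for a suffix entry.
    Returns (ordered_dns, rejected_dns).
    """
    by_id = {}
    children = defaultdict(list)
    for entry_id, parent_id, dn in entries:
        by_id[entry_id] = dn
        children[parent_id].append(entry_id)

    ordered = []
    seen = set()
    # Start from the suffix entries and walk down breadth-first, so a
    # parent is always written before any of its children.
    queue = deque(sorted(children[None]))
    while queue:
        entry_id = queue.popleft()
        if entry_id in seen:
            continue
        seen.add(entry_id)
        ordered.append(by_id[entry_id])
        queue.extend(sorted(children[entry_id]))

    # Anything not reached has a missing (or cyclic) parent chain.
    rejects = [by_id[i] for i in sorted(by_id) if i not in seen]
    return ordered, rejects


if __name__ == "__main__":
    sample = [
        (3, 2, "uid=a,ou=people,dc=example,dc=com"),
        (1, None, "dc=example,dc=com"),
        (2, 1, "ou=people,dc=example,dc=com"),
        (5, 4, "uid=orphan,ou=gone,dc=example,dc=com"),  # parent 4 is lost
    ]
    good, bad = export_id2entry_only(sample)
    print("LDIF order:", good)
    print("rejects:", bad)
```

This does not detect corruption by itself. It only shows that a parents-first export plus a rejects file is possible without touching any index, once a human has decided to run in that mode.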
If our database had proper checksumming of content and pages, we could
detect this, but today that's not the case :(

-- 
Sincerely,

William Brown
Software Engineer
Red Hat, Australia/Brisbane
_______________________________________________
389-devel mailing list -- 389-devel@xxxxxxxxxxxxxxxxxxxxxxx
To unsubscribe send an email to 389-devel-leave@xxxxxxxxxxxxxxxxxxxxxxx