Re: Strategy proposal for making DB dump in LDIF format from dbscan

> >
> > I have a question / concern though. I thought that we want dbscan 2
> > ldif for emergency recovery scenarios when all else has gone bad and
> > assuming that id2entry is still readable. In the approach you
> > described we make the assumption that the parentid index is readable
> > as well. So we depend on two files instead of one for exporting the
> > database. Does this matter, or do we not care at all?
> There are two scenarios here in my opinion.  Backup, and emergency
> backup :-)  As I've previously stated: performance is important.  It
> should not take forever to process a 100 million entry database.  I
> think the tool should use multiple index files (id2entry + friends) if
> we can generate the LDIF faster.  But, if some of those indexes are
> corrupted, then we need an alternate algorithm to generate it just from
> id2entry.  Also, if we are dealing with a corrupted db, then performance
> is not important, recovery is.  So if we can do it fast, do it,
> otherwise grind it out.
> 
> All that being said, there is something we need to consider, which I
> don't have an answer for: when databases do get corrupted, which files
> typically get corrupted?  Is it the indexes, or is it id2entry?
> To be honest, database corruption doesn't happen very often, but the
> tool should be smart enough to realize that the data could be inaccurate.
> Perhaps a parent could be missing, etc.  So the tool should be robust
> enough to use multiple techniques to complete an entry, and if it can't
> it should log something, or better yet create a rejects file that an
> Admin can take and repair manually.
> 
> I know this is getting more complicated, but we need to keep these
> things in mind.
> 
> Regards,
> Mark
> >

With the current design of id2entry and friends, we can't easily detect
index corruption automatically. I think we should really just have a
flag on dbscan that says "ignore everything BUT id2entry" and recover
all you can. We should leave it to a human to make that call.
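
To make the idea concrete, here is a rough sketch of how such a flag could
behave. This is not dbscan code: the function name, the `id2entry_only`
parameter, and the in-memory dicts standing in for id2entry and the parentid
index are all hypothetical, and it assumes each entry's parent ID can be read
from the entry itself. Entries that cannot be attached to the tree end up in
a rejects list for an admin to repair by hand.

```python
from collections import defaultdict, deque


def order_for_ldif(entries, parentid_index=None, id2entry_only=False):
    """Return (ordered_ids, rejected_ids), parents before children.

    entries: dict of entry ID -> parent ID (None for a suffix entry).
             Hypothetical stand-in for what is read from id2entry.
    parentid_index: dict of parent ID -> list of child IDs, or None.
             Hypothetical stand-in for the parentid index.
    """
    if id2entry_only or parentid_index is None:
        # Slow path: rebuild the parent -> children map from the entries alone.
        children = defaultdict(list)
        for eid, pid in sorted(entries.items()):
            if pid is not None:
                children[pid].append(eid)
    else:
        # Fast path: trust the index as given, without re-deriving it.
        children = parentid_index

    ordered, seen = [], set()
    queue = deque(sorted(eid for eid, pid in entries.items() if pid is None))
    while queue:
        eid = queue.popleft()
        if eid in seen or eid not in entries:
            continue  # skip duplicates and index references to missing entries
        seen.add(eid)
        ordered.append(eid)
        queue.extend(children.get(eid, []))

    # Anything never reached (for example, its parent is missing) is rejected.
    rejected = sorted(set(entries) - seen)
    return ordered, rejected
```

Calling it with `id2entry_only=True` ignores whatever index is passed in, which
is the "human makes the call" behaviour: the tool does not try to guess
whether the index can be trusted.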

If our database had proper checksumming of content and pages, we could
detect this, but today that's not the case :( 
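
For illustration only, this is the kind of check I mean. It is a minimal
sketch that assumes a made-up record layout (a 4-byte CRC32 prefix in front
of each payload); it is not how the database stores data today, and `seal`
and `verify` are hypothetical names.

```python
import zlib


def seal(payload: bytes) -> bytes:
    """Prefix the payload with its CRC32 (hypothetical record layout)."""
    return zlib.crc32(payload).to_bytes(4, "big") + payload


def verify(record: bytes) -> bool:
    """Return True if the stored CRC32 matches the payload that follows it."""
    if len(record) < 4:
        return False
    stored = int.from_bytes(record[:4], "big")
    return zlib.crc32(record[4:]) == stored
```

With something like this in place, a reader could tell a damaged record from
a good one on its own, instead of relying on a human to decide which files to
trust.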

-- 
Sincerely,

William Brown
Software Engineer
Red Hat, Australia/Brisbane

_______________________________________________
389-devel mailing list -- 389-devel@xxxxxxxxxxxxxxxxxxxxxxx
To unsubscribe send an email to 389-devel-leave@xxxxxxxxxxxxxxxxxxxxxxx