On Sun, Nov 11, 2007 at 02:31:45PM +1100, Bron Gondwana wrote: > * the one buggy skiplist I actually still have a copy of, the "logstart" > value in the header is wrong, causing recovery to fail with only a few > of the records still reachable because it hits another INORDER record > rather than the ADD record and drops out. I've got the monitoring > system set up to let me know if it finds any other skiplist errors and > take a copy of the offending file. Ok, so attached is the patch that deals with this one particular case and adds a bit more debugging as well. It will log an ERROR if it sees the case that would previously cause a total bailout. NOTE - there's still heaps of "add this value read from file to this mmap base pointer and dereference with impunity" scattered through this code, which is the sort of accident waiting to happen I've been trying to remove from everything else along with the CRC32 stuff I keep promising. I also wrote a much bigger patch which you don't get to see yet because someone might be silly enough to try and use it. It does some variable renaming, refactoring, etc. At least there's some signed vs unsigned fuzziness I'm still not happy with here, and a naming policy that made it clearer when a variable contained network-byte-order and when it contained host-byte-order would be peachy too. My big patch does that. Sadly it also infinite loops during recovery, which suggests I did something pretty stupid in it somewhere! Too late to debug that tonight. Anyway - here it is. A "recovery()" that copes if the logstart parameter in the database header is wrong. No, I don't have a clue how that happened unless lseek() lied. Maybe it sometimes lies, I don't know. I'll be writing a test case for that soon too! Bron.
Index: cyrus-imapd-2.3.10/lib/cyrusdb_skiplist.c =================================================================== --- cyrus-imapd-2.3.10.orig/lib/cyrusdb_skiplist.c 2007-11-11 07:44:25.000000000 -0500 +++ cyrus-imapd-2.3.10/lib/cyrusdb_skiplist.c 2007-11-11 07:59:47.000000000 -0500 @@ -1206,7 +1206,7 @@ lseek(tp->syncfd, tp->logend, SEEK_SET); r = retry_writev(tp->syncfd, iov, num_iov); if (r < 0) { - syslog(LOG_ERR, "DBERROR: retry_writev(): %m"); + syslog(LOG_ERR, "DBERROR: retry_writev() %s: %m", db->fname); myabort(db, tp); return CYRUSDB_IOERROR; } @@ -1926,20 +1926,13 @@ /* reset the data that was written INORDER by the last checkpoint */ offset = DUMMY_OFFSET(db) + DUMMY_SIZE(db); - while (!r && (offset < (bit32) db->logstart)) { + while (!r && (offset < db->map_size) + && (TYPE(db->map_base+offset) == INORDER)) { ptr = db->map_base + offset; offsetnet = htonl(offset); db->listsize++; - /* make sure this is INORDER */ - if (TYPE(ptr) != INORDER) { - syslog(LOG_ERR, "DBERROR: skiplist recovery: %04X should be INORDER", - offset); - r = CYRUSDB_IOERROR; - continue; - } - /* xxx check \0 fill on key */ /* xxx check \0 fill on data */ @@ -1980,6 +1973,11 @@ } } + if (offset != db->logstart) { + syslog(LOG_ERR, "DBERROR: recovery logstart %s: %04X not %04X", + db->fname, offset, db->logstart); + } + /* zero out the remaining pointers */ if (!r) { for (i = 0; !r && i < db->maxlevel; i++) {
---- Cyrus Home Page: http://cyrusimap.web.cmu.edu/ Cyrus Wiki/FAQ: http://cyrusimap.web.cmu.edu/twiki List Archives/Info: http://asg.web.cmu.edu/cyrus/mailing-list.html