On Tue, 13 Nov 2007 09:12:18 +0100 (CET), "Simon Matter" <simon.matter@xxxxxxxxx> said: > > On Mon, Nov 12, 2007 at 12:34:34AM +1100, Bron Gondwana wrote: > >> Anyway - here it is. A "recovery()" that copes if the logstart > >> parameter in the database header is wrong. No, I don't have a > >> clue how that happened unless lseek() lied. Maybe it sometimes > >> lies, I don't know. I'll be writing a test case for that soon > >> too! > > > > I have some more suspicions now, but I wrote it all up in the > > patch header, so here's the bugfixes only patch, a "robustness" > > extras patch and the tool I used for testing. > > > > Ken, I know you've done some other work on the file changing > > types. I'd like to be even more agressive and convert just > > about everything to bit32 and also rename some variables, but > > I restricted myself in this to only fixing the most ugly case: > > offset = htonl(offset). > > > > These patches are all against 2.3.10 (in this order), and may > > need some fuzz fixing to apply against your latest CVS thanks > > to those changes - sorry I haven't done that, but it's getting > > on 1am for me, and I've just finished doing a lot of testing > > and paring these down to simple and clear patches that don't > > touch more than they need to fix the issues. > > > > > cyrus-skiplist-bugfixes-2.3.10.diff: > > > > > cyrus-skiplist-robustify-2.3.10.diff: > > > > Hi Bron, > > I didn't have much troubles with skiplist over the years and it has been > a > blessing since moving away from BDB. But I did have a few issues with > broken skiplist files so your patches are very welcome. I have included > the patches in my private rpm packages to try how they work. Do you > recommend both for general consumption? They've been running for 24 hours on all our production systems with no ill effects :) Seriously - yes, I do. They are quite short, and they're the culmination of about 3 days of pretty heavy work over the weekend and Monday after we lost a mailboxes.db on our busiest store to one of these bugs (my wife and kids were getting ready to kill me for neglecting them towards the end, I'm sure!) I build multiple different patches and tested them over that time. I also wrote a Perl module that can read skiplist files natively and tested some things with that as well. These couple of patches I have posted are the best bits of those distilled down into the simplest and clearest small set of changes. They've been hit pretty hard with the hammer scripts. I've also got another patch which I'll attach here that I wrote today which re-tunes the "how often to checkpoint" calculation. I want our mailboxes.db files especially to checkpoint more frequently, as that will make them less "seeky" - which will help with cachelines at least. We have enough memory (and always plenty free) that I'm sure every page is hot in cache within a few minutes. The seekyness is mainly an issue with clients doing "LIST", which our web interface does at login, so we want it to be as quick as possible. As for seen files - well, they tend to be small and frequently updated, so they'll just checkpoint about 4 times as often now. Will save a tiny bit of disk space but more interestingly reduce the memory footprint to keep them all in cache. Bron. -- Bron Gondwana brong@xxxxxxxxxxx
Skiplist tuning With random changes to a mailboxes.db file, it could be nearly 100% random seeks before it recompressed. A seen file would need to reach 16kb before even considering re-compressing, with a real data length of just a couple of hundred bytes. This patch reduces the limits to: 4kb overhead 120% rather than 200% of current "sorted" size. Index: cyrus-imapd-2.3.10/lib/cyrusdb_skiplist.c =================================================================== --- cyrus-imapd-2.3.10.orig/lib/cyrusdb_skiplist.c 2007-11-12 23:53:34.000000000 -0500 +++ cyrus-imapd-2.3.10/lib/cyrusdb_skiplist.c 2007-11-12 23:57:38.000000000 -0500 @@ -302,7 +302,7 @@ SKIPLIST_VERSION = 1, SKIPLIST_VERSION_MINOR = 2, SKIPLIST_MAXLEVEL = 20, - SKIPLIST_MINREWRITE = 16834 /* don't rewrite logs smaller than this */ + SKIPLIST_MINREWRITE = 4096 /* don't rewrite logs smaller than this */ }; #define BIT32_MAX 4294967295U @@ -1392,8 +1392,8 @@ } done: - /* consider checkpointing */ - if (!r && tid->logend > (2 * db->logstart + SKIPLIST_MINREWRITE)) { + /* consider checkpointing (journal is 20% of data length) */ + if (!r && tid->logend > (12 * db->logstart / 10 + SKIPLIST_MINREWRITE)) { r = mycheckpoint(db, 1); }
---- Cyrus Home Page: http://cyrusimap.web.cmu.edu/ Cyrus Wiki/FAQ: http://cyrusimap.web.cmu.edu/twiki List Archives/Info: http://asg.web.cmu.edu/cyrus/mailing-list.html