On Sat, 9 Jun 2007, Rob Mueller wrote:

>> I run it directly, outside of master. That way when it crashes, it
>> can be easily restarted. I have a script that checks that it's
>> running, that the log file isn't too big, and that there are no
>> log-PID files that are too old. If anything like that happens, it
>> pages someone.
>
> Ditto, we do almost exactly the same thing.

And for that matter, so do I.

> I think there's certain race conditions that still need ironing out,
> because rerunning sync_client on the same log file that caused a bail
> out usually succeeds the second time.

I suspect that the problem is with mailbox renames, which are not atomic
and can take some time to complete with very large mailboxes.

sync_client retries a number of times and then bails out:

    if (folder_list->count) {
        int n = 0;
        do {
            sleep(n*2);  /* XXX  should this be longer? */
            ...
        } while (r && (++n < SYNC_MAILBOX_RETRIES));

        if (r) goto bail;
    }

This was one of the most significant compromises that Ken had to make
when integrating my code into 2.3. My original code cheats, courtesy of
two other patches:

HERMES_FAST_RENAME:
    Translates mailbox rename into filesystem rename() where possible.
    Useful because sync_client chdir()s into the working directory.
    Would be less useful in 2.3 with split metadata.

HERMES_SYNC_SNAPSHOT:
    If mailbox action fails, promote to user action (no shared mailboxes).
    If user action fails then lock user out of the mboxlist and try again.

Together with my version of delayed expunge this pretty much guarantees
that things aren't moving around under sync_client's feet. It's been an
awful long time (about a year?) since I last had a sync_client bail out.

We are moving to 2.3 over the summer (initially using my own original
replication code), so this is something that I would like to sort out.
Any suggestions?
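For anyone who wants to experiment with the retry timing outside of sync_client, here is a minimal standalone sketch of the same loop shape as the fragment quoted above. It is not Cyrus code: retry_linear_backoff(), attempt_fn, sleep_fn, fake_sleep() and fail_until() are all hypothetical names invented for illustration, and the sleep is passed in as a callback so the loop can be exercised without actually waiting.

```c
/* Hypothetical callback types, not part of Cyrus. */
typedef int (*attempt_fn)(void *ctx);       /* 0 on success, non-zero on failure */
typedef void (*sleep_fn)(unsigned seconds); /* e.g. a wrapper around sleep(3) */

/*
 * Same shape as the quoted loop: wait n*2 seconds, attempt, and repeat
 * while the attempt fails and fewer than max_retries attempts were made.
 * Returns the result of the last attempt. If total_wait is non-NULL, the
 * number of seconds requested from do_sleep is added to it.
 */
int retry_linear_backoff(attempt_fn attempt, void *ctx, int max_retries,
                         sleep_fn do_sleep, unsigned *total_wait)
{
    int n = 0;
    int r;

    do {
        unsigned delay = (unsigned)n * 2;

        do_sleep(delay);
        if (total_wait) *total_wait += delay;
        r = attempt(ctx);
    } while (r && (++n < max_retries));

    return r;
}

/* Stand-in for sleep(): does nothing, so experiments run instantly. */
void fake_sleep(unsigned seconds)
{
    (void)seconds;
}

/* Stand-in for a slow rename: fails until the counter in ctx reaches zero. */
int fail_until(void *ctx)
{
    int *remaining = ctx;

    if (*remaining > 0) {
        (*remaining)--;
        return -1;
    }
    return 0;
}
```

With N attempts the delays are 0, 2, 4, ..., 2(N-1) seconds, so the worst-case total wait before bailing out is N*(N-1) seconds. If a rename of a very large mailbox takes longer than that, a loop of this shape will still give up, which would be consistent with a second run on the same log file succeeding.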
-- 
David Carter                             Email: David.Carter@xxxxxxxxxxxxx
University Computing Service,            Phone: (01223) 334502
New Museums Site, Pembroke Street,       Fax:   (01223) 334679
Cambridge UK. CB2 3QH.
----
Cyrus Home Page: http://cyrusimap.web.cmu.edu/
Cyrus Wiki/FAQ: http://cyrusimap.web.cmu.edu/twiki
List Archives/Info: http://asg.web.cmu.edu/cyrus/mailing-list.html