[BUG] Replication synchronisation fails when message size on disk doesn't match index data

Hi Ken,

Attached is a patch I developed after our regular "CheckReplication"
task discovered mismatched message sizes in some mailboxes.  On
examining the files on disk I discovered that they were identical up
to the length of the shorter message.  The longer message sometimes
held the rest of the email, and sometimes contained a bunch of extra
"junk" that looked remarkably like the index file entries for other
messages in the folder.

I tracked it down to the following code:

sync_client.c:1174:        r = mailbox_map_message(mailbox, record->uid, &msg_base, &msg_size);
..
sync_client.c:1190:        prot_printf(toserver, "{%lu+}\r\n", record->size);
sync_client.c:1191:        prot_write(toserver, (char *)msg_base, record->size);

As you can see, it assumes msg_size and record->size are identical without
checking.  If there is corruption on the data partition and something has
gone wrong with the message file size, then this can cause fewer than
record->size bytes to be written.

The attached patch sends an IOERROR: syslog message and returns an error code
rather than sending any data for the associated message to the sync_server,
hence alerting the admin to the problem and allowing it to be resolved.

An alternative would be replicating the bogus message file by using msg_size
rather than record->size in the two final lines above.  This has the advantage
of neither breaking replication for later messages nor causing even weirder
corrupted files on the destination, but on the downside it doesn't inform the
sysadmin.

I guess doing that along with a syslog message is another sane approach, since
you'd still know of the issue but replication would continue.  In our case we're
happy to have replication fail since we have monitoring scripts that will scream
at us when that happens and we'll get in and fix things pronto.
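
A minimal sketch of that second approach (log the mismatch, then send
exactly the bytes that were mapped) might look like the following.  It is
not part of the attached patch: bytes_to_send() and emit_literal() are
hypothetical helpers, a plain buffer stands in for prot_printf/prot_write,
and stderr stands in for syslog.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical result of comparing the mapped file with the index record. */
enum size_check { SIZE_OK, SIZE_MISMATCH };

/*
 * Decide how many bytes to send for one message.
 *   msg_size    - bytes actually mapped from the message file on disk
 *   record_size - size stored in the index for the same message
 * Always returns msg_size, so the sender never goes beyond the mapping;
 * *status tells the caller whether a warning should be logged.
 */
static unsigned long bytes_to_send(unsigned long msg_size,
                                   unsigned long record_size,
                                   enum size_check *status)
{
    *status = (msg_size == record_size) ? SIZE_OK : SIZE_MISMATCH;
    return msg_size;
}

/*
 * Write a "{N+}\r\n" literal header followed by N message bytes into out.
 * Stand-in for the prot_printf()/prot_write() pair in sync_client.c.
 * Returns the number of bytes written, or -1 if out is too small.
 */
static long emit_literal(char *out, size_t outlen,
                         const char *msg_base, unsigned long msg_size,
                         unsigned long record_size, unsigned long uid)
{
    enum size_check status;
    unsigned long n = bytes_to_send(msg_size, record_size, &status);
    int hdr;

    if (status == SIZE_MISMATCH) {
        /* Real code would call syslog(LOG_ERR, ...) here. */
        fprintf(stderr,
                "IOERROR: message size mismatch for %lu (%lu <> %lu)\n",
                uid, msg_size, record_size);
    }

    /* The announced literal length and the bytes copied use the same n. */
    hdr = snprintf(out, outlen, "{%lu+}\r\n", n);
    if (hdr < 0 || (size_t)hdr + n > outlen)
        return -1;
    memcpy(out + hdr, msg_base, n);
    return (long)hdr + (long)n;
}
```

The point of the design is that the length announced in the literal and
the number of bytes actually written always come from the same value, so
the stream stays in step for later messages even when the index disagrees
with the file on disk.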

Regards,

Bron.
-- 
  Bron Gondwana
  brong@xxxxxxxxxxx

diff -ur --new-file cyrus-imapd-cvs.orig/imap/sync_client.c cyrus-imapd-cvs/imap/sync_client.c
--- cyrus-imapd-cvs.orig/imap/sync_client.c	2006-07-26 20:03:15.000000000 -0400
+++ cyrus-imapd-cvs/imap/sync_client.c	2006-11-25 01:45:24.000000000 -0500
@@ -1178,6 +1178,12 @@
                    record->uid, mailbox->name);
             return(IMAP_IOERROR);
         }
+        if (msg_size != record->size) {
+            syslog(LOG_ERR,
+                   "IOERROR: message size mismatch for %lu of %s (%lu <> %lu): %m",
+                   record->uid, mailbox->name, msg_size, record->size);
+            return(IMAP_IOERROR);
+        }
 
         prot_printf(toserver, " %lu %lu %lu {%lu+}\r\n",
 		    record->header_size, record->content_lines,
