On Thu, 15 Mar 2007 13:09:04 +0000 (GMT), "David Carter" <dpc22@xxxxxxxxx> said:
> On Thu, 15 Mar 2007, Rob Mueller wrote:
>
> > May not be true, but:
> >
> >> Is it safe? - we calculated that with one billion messages you have a one
> >> in 1 billion chance of a birthday collision (two random messages with
> >> the same UUID).
> >
> > Is true.
>
> Fair enough.
>
> With hindsight I should probably have defined message UUIDs to be the full
> MD5 hash: 128 bits isn't that much worse than 96 bits per message. What is
> the CPU overhead like for calculating MD5 sums for everything on the fly?

Honestly, we don't even notice it in the noise, especially since IO is the
main limiting factor on these machines. Also, you only have to do it once
per message, at delivery time.

I'd be tempted to write an RFC for providing both the MD5 and SHA1 hash
via IMAP, and caching them both in the cyrus.cache if not the cyrus.index.
Would make client clean-up-after-inconsistency handling, and backups for
that matter, much cleaner.

> UUIDs started out life as Mailbox UniqueID (64 bits) plus Message UID (32
> bits), hence the size and rather unfortunate name. The hash algorithm
> used to generate mailbox uniqueIDs is a bit basic, which is why I switched
> to generating them on the fly from master.

Sure. The UUID code looks really bolted on, which I guess it is.
lib/message_uuid* are nice, but the master integration and pass-by-env
and stuff is pretty messy! Really, we have already proved that we get
by fine without them (given how many were all zero in our system already!)

Oh - by the way, don't go rolling out all our patches all at once and then
reconstructing your mailboxes to get new UUIDs; you'll find UUID mismatches
across your replication system really fast! I'm going to clean that up now :(

Bron.
-- 
  Bron Gondwana
  brong@xxxxxxxxxxx

----
Cyrus Home Page: http://cyrusimap.web.cmu.edu/
Cyrus Wiki/FAQ: http://cyrusimap.web.cmu.edu/twiki
List Archives/Info: http://asg.web.cmu.edu/cyrus/mailing-list.html