Re: rados mailbox? (was Re: Ceph for email storage)

On 10.07.2012 07:45, Kristofer wrote:
> Very short answer to this.
> 
> It can work if you direct all email requests for a particular mailbox to a
> single machine. You need to avoid locking between servers as much as possible.
> 
> Messages will need to be indexed, period.  Or else your life will suck.
> 
> Dovecot has a nice writeup on this type of thing. It is about NFS rather than
> Ceph, but it can be extrapolated to Ceph or any distributed storage:
> http://wiki.dovecot.org/NFS

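To make the "one mailbox, one machine" point concrete, here is a minimal sketch of how a proxy in front of the IMAP servers might pick a backend. Everything in it is an assumption for illustration: the helper name, the host names, and the plain hash-modulo scheme are not taken from Dovecot or Ceph.

```python
import hashlib
from typing import Sequence


def backend_for_mailbox(mailbox: str, backends: Sequence[str]) -> str:
    """Map a mailbox name to exactly one backend (hypothetical helper).

    The same mailbox always lands on the same server as long as the
    backend list does not change, so two servers never touch one
    mailbox at the same time.
    """
    if not backends:
        raise ValueError("no backends configured")
    digest = hashlib.sha256(mailbox.encode("utf-8")).digest()
    # Use the first 8 bytes of the digest as an unsigned integer.
    slot = int.from_bytes(digest[:8], "big") % len(backends)
    return backends[slot]


# Hypothetical host names, for illustration only.
BACKENDS = ["imap1.example.org", "imap2.example.org", "imap3.example.org"]
chosen = backend_for_mailbox("alice@example.org", BACKENDS)
```

Plain modulo hashing remaps many mailboxes whenever a backend is added or removed, so a real setup would probably want something more stable than this.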

>>
>>   - each mail message is a rados object, and immutable.
>>   - each mailbox is an index of messages, stored in a rados object.
>>     - the index consists of omap records, one for each message.
>>     - the key is some unique id
>>     - the value is a copy of (a useful subset of) the message headers
>>
>> This has a number of nice properties:
>>
>>   - you can efficiently list messages in the mailbox using the omap
>>     operations
>>   - you can (more) efficiently search messages (everything but the message
>>     body) based on the index contents (since it's all stored in one object)
>>   - you can efficiently grab recent messages with the omap ops (e.g., list
>>     keys > last_seen_msgid)
>>   - moving messages between folders involves updating the indices only; the
>>     message objects need not be copied/moved.
>>   - no metadata bottleneck: mailbox indices are distributed across the
>>     entire cluster, just like the mail.
>>   - all the scaling benefits of rados for a growing mail system.
>>
>> I don't know enough about what exactly the mail storage backends need to
>> support to know what issues will come up.  Presumably there are several.
>> E.g., if you delete a message, is the IMAP client expected to discover
>> that efficiently?  And do the mail storage backends attempt to do it
>> efficiently?
>>
>> This also doesn't solve the problem of efficiently indexing/searching the
>> bodies of messages, although I suspect that indexing could be efficiently
>> implemented on top of this scheme.
>>
>> So, a non-trivial project, but probably one that can be prototyped without
>> that much pain, and one that would perform and scale drastically better
>> than existing solutions I'm aware of.
>>
>> I'm hoping there are some motivated hackers lurking who understand the
>> pain that is maildir/mail infrastructure...
>>

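The layout quoted above can be modelled in a few lines. This is only an in-memory sketch: `FakeRados` is a stand-in written for this example and is not the librados API, and the object naming scheme, the zero-padded sequence keys, and the choice of indexed headers are all assumptions.

```python
from dataclasses import dataclass, field


@dataclass
class FakeRados:
    """In-memory stand-in for a rados pool; NOT the librados API."""
    data: dict = field(default_factory=dict)  # object name -> bytes
    omap: dict = field(default_factory=dict)  # object name -> {key: value}

    def write_once(self, oid, payload):
        # Message objects are immutable: refuse to overwrite.
        if oid in self.data:
            raise ValueError("object already exists: " + oid)
        self.data[oid] = payload

    def omap_set(self, oid, key, value):
        self.omap.setdefault(oid, {})[key] = value

    def omap_rm(self, oid, key):
        return self.omap[oid].pop(key)

    def omap_list(self, oid, start_after=""):
        # Keys come back sorted, which allows "list keys > last_seen".
        entries = self.omap.get(oid, {})
        return [(k, entries[k]) for k in sorted(entries) if k > start_after]


INDEXED_HEADERS = ("From", "Subject", "Date")  # assumed "useful subset"


def deliver(store, mailbox, seq, headers, body):
    """Store one message object and add one index record for it."""
    key = f"{seq:016d}"  # assumed: globally unique, increasing id
    store.write_once("msg." + key, body)
    summary = {h: headers[h] for h in INDEXED_HEADERS if h in headers}
    store.omap_set("mailbox." + mailbox, key, summary)
    return key


def move(store, src, dst, key):
    """Move a message between folders by touching the indices only."""
    entry = store.omap_rm("mailbox." + src, key)
    store.omap_set("mailbox." + dst, key, entry)


store = FakeRados()
k1 = deliver(store, "alice.INBOX", 1, {"From": "bob", "Subject": "hi"}, b"first")
k2 = deliver(store, "alice.INBOX", 2, {"From": "eve", "Subject": "re"}, b"second")
unseen = store.omap_list("mailbox.alice.INBOX", start_after=k1)
move(store, "alice.INBOX", "alice.Archive", k1)
```

In this model the move touches two index objects and no message object. On a real cluster those would be two separate operations, so the backend would have to decide what happens if it crashes between them.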
Another idea that could be done with little effort would be to reuse most of
the code from dbmail and build a cephmail version out of it.


-- 

Kind regards,


Smart Weblications GmbH
Martinsberger Str. 1
D-95119 Naila

fon.: +49 9282 9638 200
fax.: +49 9282 9638 205
24/7: +49 900 144 000 00 - 0,99 EUR/Min*
http://www.smart-weblications.de

--
Registered office: Naila
Managing director: Florian Wiessner
Commercial register no.: HRB 3840, Amtsgericht Hof
*from German landlines; prices from mobile networks may differ

