On 22.09.2017 at 22:59, Gregory Farnum wrote:
[..]
> This is super cool! Is there anything written down that explains this
> for Ceph developers who aren't familiar with the workings of Dovecot?
> I've got some questions I see going through it, but they may be very
> dumb.
>
> *) Why are indexes going on CephFS? Is this just about wanting a local
> cache, or about the existing Dovecot implementations, or something
> else? Almost seems like you could just store the whole thing in a
> CephFS filesystem if that's safe. ;)

If everything works as expected, this is only an intermediate step. One
idea (https://dalgaaf.github.io/CephMeetUpBerlin20170918-librmb/#/status-3)
is to use omap to store the index/metadata. We chose a step-by-step
approach: since we are not yet sure whether omap would perform well
enough, we use CephFS for now (also because this requires no changes in
Dovecot). Our current focus is on developing the first version of librmb,
but the code to use omap is already there. It still needs integration,
testing, and performance tuning to verify that it meets our requirements.

> *) It looks like each email is getting its own object in RADOS, and I
> assume those are small messages, which leads me to

The mail size distribution looks like this:
https://dalgaaf.github.io/CephMeetUpBerlin20170918-librmb/#/mailplatform-mails-dist

Yes, the majority of the mails are under 500k, but most objects are
around 50k. There are not that many very small objects.

> *) is it really cost-acceptable to not use EC pools on email data?

We will use EC pools for the mail objects and replication for CephFS.

But even without EC there would be a cost case compared to the current
system. We will save a large number of IOPS on the new platform since the
(NFS) POSIX layer is removed from the IO path (at least for the mail
objects). And we expect that Ceph on commodity hardware can compete with a
traditional enterprise NAS/NFS anyway.

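To make the raw-capacity side of the EC question concrete, here is a rough
back-of-the-envelope sketch. It is not our actual sizing: the 3x replication
and the 4+2 EC profile are assumptions chosen purely for illustration, and
the function names are hypothetical. Only the formulas themselves (N raw
bytes per user byte for N-way replication, (k+m)/k for a k+m EC pool) are
general.

```python
def replication_factor(copies: int) -> float:
    """Raw bytes stored per byte of user data with N-way replication."""
    if copies < 1:
        raise ValueError("copies must be >= 1")
    return float(copies)


def ec_factor(k: int, m: int) -> float:
    """Raw bytes stored per byte of user data in a k+m erasure-coded pool."""
    if k < 1 or m < 0:
        raise ValueError("need k >= 1 and m >= 0")
    return (k + m) / k


# Assumptions for illustration only; not taken from the platform design.
ASSUMED_COPIES = 3
ASSUMED_K, ASSUMED_M = 4, 2

rep = replication_factor(ASSUMED_COPIES)
ec = ec_factor(ASSUMED_K, ASSUMED_M)
saving = 1 - ec / rep  # fraction of raw capacity saved by EC vs. replication

print(f"replication x{ASSUMED_COPIES}: {rep:.2f}x raw")
print(f"EC {ASSUMED_K}+{ASSUMED_M}: {ec:.2f}x raw")
print(f"raw capacity saved by EC: {saving:.0%}")
```

This only covers raw capacity. It says nothing about the IOPS saved by
removing the NFS/POSIX layer, which is a separate part of the cost case.
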
> *) isn't per-object metadata overhead a big cost compared to the
> actual stored data?

I assume not. The metadata/index is small compared to the size of the
mails (currently, with NFS, around 10% I would say). In the classic
NFS-based Dovecot the number of index/cache/metadata files is an issue
anyway. With 6.7 billion mails we have 1.2 billion index/cache/metadata
files
(https://dalgaaf.github.io/CephMeetUpBerlin20170918-librmb/#/mailplatform-mails-nums).

Danny