> To be honest - it doesn't actually hurt too badly once it's in memory
> cache. The cyrus.cache file isn't generally needed to be entirely
> read, and the secret of mmap is that you only read the bits you need
> as you need them - it's lazily loaded.
I fully agree with you. But I don't know what Cyrus actually reads on SELECT to build the mailbox structure.
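For reference, the lazy access pattern described in the quote can be sketched as below. This is only an illustration in Python, not Cyrus code: the temporary file is a hypothetical stand-in for cyrus.cache, and `read_slice` is a name made up for the example.

```python
import mmap
import os
import tempfile

def read_slice(path, offset, length):
    """Map a file read-only and return only the bytes at [offset, offset+length)."""
    with open(path, "rb") as f:
        # Length 0 maps the whole file; nothing is read until bytes are accessed.
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as m:
            # Only the pages backing this slice need to be faulted in.
            return m[offset:offset + length]

# Hypothetical stand-in for a cache file: 99 pages of "A", then one page of "B".
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"A" * 4096 * 99 + b"B" * 4096)
    demo_path = tmp.name

record = read_slice(demo_path, 4096 * 99, 16)
os.remove(demo_path)
print(record)
```

The access to the mapped bytes is served by page faults rather than by read() system calls, so how much is really fetched depends on the kernel (page size, readahead) and not on the slice length alone.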
Since strace doesn't show what mmap reads on SELECT, I ran a test against an NFS server.
With a 7MB mailbox containing 250 emails, cyrus.index is about 20KB and cyrus.cache is about 350KB.
- on SELECT, nfsstat shows 15 NFS READs => 480KB of on-the-wire NFS READ traffic. It seems that both cyrus.cache and cyrus.index are read
- on CLOSE, nfsstat shows 19 NFS WRITEs and strace shows that both files are rewritten
With a 6GB mailbox containing almost 100,000 emails, cyrus.index is about 8MB and cyrus.cache is about 120MB.
- on SELECT, nfsstat shows 300 NFS READs => 9600KB of on-the-wire NFS READ traffic. OK, that is less than the combined size of cyrus.index and cyrus.cache
- on CLOSE, nfsstat shows 4105 NFS READs and 4144 NFS WRITEs => 2x130MB of on-the-wire NFS traffic.
In such a situation mmap doesn't help, because everything is read and written. I hope this behaviour can be optimized.
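The on-the-wire figures above can be rechecked with a few lines of arithmetic. The sketch assumes every NFS READ/WRITE moves 32KB, a value inferred from the small-mailbox numbers (480KB / 15 reads) and not taken from the mount options; `on_wire_kb` is a name made up for the example.

```python
# Assumption: 32 KB per NFS operation, inferred from 480 KB / 15 READs.
NFS_BLOCK_KB = 32

def on_wire_kb(ops, block_kb=NFS_BLOCK_KB):
    """Approximate kilobytes transferred for a given number of NFS operations."""
    return ops * block_kb

small_select = on_wire_kb(15)         # small mailbox, SELECT
large_select = on_wire_kb(300)        # large mailbox, SELECT
large_close_read = on_wire_kb(4105)   # large mailbox, CLOSE (reads)
large_close_write = on_wire_kb(4144)  # large mailbox, CLOSE (writes)

# Combined size of cyrus.index (8 MB) and cyrus.cache (120 MB), in KB.
large_files_kb = (8 + 120) * 1024

print(small_select, large_select)
print(large_close_read, large_close_write, large_files_kb)
```

Under that assumption the CLOSE on the large mailbox reads 131360KB and writes 132608KB, against 131072KB for the two files together, which matches the impression that both files are read and rewritten in full.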
> There's no real answer if you're doing a sort on the messages,
Yes, I am worried about IMAP SORT and some poor IMAP clients.
> unless you go to multiple indexes (a la database engines). That's
> a whole different ballgame - but the multiplier factor gets
> higher. For sane sizes of N (up to 20-30 thousand messages) the
> O(N) of the way Cyrus does it is cheaper than a more complex
> database.
I am not thinking of a database but of MapReduce.
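To make the idea concrete, here is a minimal sketch of a MapReduce-style sort: split the records into chunks, sort each chunk independently (map), then merge the sorted runs (reduce). It is an illustration only and has nothing to do with how Cyrus implements SORT; the record layout `(uid, date)`, the function names and the chunk size are all assumptions made for the example.

```python
import heapq
from concurrent.futures import ThreadPoolExecutor

def sort_chunk(chunk, key):
    """Map step: sort one chunk of records on its own."""
    return sorted(chunk, key=key)

def mapreduce_sort(records, key, chunk_size=10000, workers=4):
    """Sort records by sorting fixed-size chunks, then merging the sorted runs."""
    chunks = [records[i:i + chunk_size]
              for i in range(0, len(records), chunk_size)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        runs = list(pool.map(lambda c: sort_chunk(c, key), chunks))
    # Reduce step: k-way merge of the already sorted runs.
    return list(heapq.merge(*runs, key=key))

# Hypothetical records: (uid, date) pairs, sorted here by date.
messages = [(3, 30), (1, 10), (4, 40), (2, 20), (5, 5)]
by_date = mapreduce_sort(messages, key=lambda m: m[1], chunk_size=2)
print(by_date)
```

The thread pool only shows the structure; in a real deployment the chunks would be handled by separate processes or machines, and whether that beats a single O(N) pass depends on the mailbox size.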
Sébastien
----
Cyrus Home Page: http://www.cyrusimap.org/
List Archives/Info: http://lists.andrew.cmu.edu/pipermail/info-cyrus/