Steve Atkins wrote:
> So, yeah, you're right. Generally, email is too complex to deal with
> in the database as anything other than an opaque bytea blob, along
> with some metadata
Only because that's the choice made by dbmail. As an IMAP server, it
doesn't _have_ to do more. The downside is that the database is not as
useful as it could be.
I happen to have developed my own OSS project on exactly this idea: to
have a database of mail with contents in normalized form,
ready to be queried. A picture of the schema can be seen here:
http://www.manitou-mail.org/articles/db-diagram.html
and the architecture here:
http://www.manitou-mail.org/schemas/schema1.png
There's nothing particularly remarkable about the schema, except that
there is no trace left of the initial encapsulation of the data inside
an RFC822 message and its associated rules about structure and
encoding.
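To make "no trace left of the encapsulation" concrete, here is a
minimal sketch of the idea. It is not the actual Manitou-Mail schema or
code: the table names (mail, mail_address), the column names, and the
helpers open_db() and store_mail() are made up for illustration, and it
uses Python's standard email and sqlite3 modules only so that it runs
without a server. The point is that headers and body are decoded once,
at import time, so that what lands in the tables is plain text with no
MIME encoded-words, charsets, or transfer encodings left to deal with.

```python
import sqlite3
from email import message_from_bytes, policy

# A small RFC822 message with an encoded-word subject and a
# quoted-printable, ISO-8859-1 body (sample data, not from the post).
RAW = (
    b"From: =?utf-8?q?Ren=C3=A9?= <rene@example.org>\r\n"
    b"To: list@example.org\r\n"
    b"Subject: =?iso-8859-1?q?Caf=E9_menu?=\r\n"
    b"MIME-Version: 1.0\r\n"
    b"Content-Type: text/plain; charset=iso-8859-1\r\n"
    b"Content-Transfer-Encoding: quoted-printable\r\n"
    b"\r\n"
    b"Cr=E8me br=FBl=E9e is back.\r\n"
)


def open_db():
    """Create an in-memory database with a hypothetical normalized layout."""
    db = sqlite3.connect(":memory:")
    db.executescript(
        """
        CREATE TABLE mail (
            mail_id INTEGER PRIMARY KEY,
            subject TEXT,
            body    TEXT
        );
        CREATE TABLE mail_address (
            mail_id INTEGER REFERENCES mail(mail_id),
            kind    TEXT,   -- 'From', 'To' or 'Cc'
            name    TEXT,
            email   TEXT
        );
        """
    )
    return db


def store_mail(db, raw):
    """Parse one raw message and store only its decoded contents."""
    msg = message_from_bytes(raw, policy=policy.default)

    # Decoded subject: the encoded-word syntax is gone after parsing.
    subject = str(msg["Subject"] or "")

    # Decoded body: charset and transfer encoding are undone here.
    part = msg.get_body(preferencelist=("plain",))
    body = part.get_content() if part is not None else ""

    cur = db.execute(
        "INSERT INTO mail (subject, body) VALUES (?, ?)", (subject, body)
    )
    mail_id = cur.lastrowid

    # One row per address instead of one unparsed header string.
    for kind in ("From", "To", "Cc"):
        header = msg[kind]
        if header is None:
            continue
        for addr in header.addresses:
            db.execute(
                "INSERT INTO mail_address (mail_id, kind, name, email) "
                "VALUES (?, ?, ?, ?)",
                (mail_id, kind, addr.display_name, addr.addr_spec),
            )
    return mail_id
```

After store_mail(open_db(), RAW), a query such as
SELECT subject FROM mail WHERE subject LIKE 'Caf%' works on readable
text, which is what an opaque blob cannot offer.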
The next step has been to write a MUA that talks directly in SQL to the
database, and the resulting speed and efficiency are much better than
with traditional IMAP-based MUAs.
As an example related to search, I have this 10 GB database containing
600k mails, and hundreds of results for a full-text search typically
come back to the MUA in a couple of seconds, Gmail-like, on a low-grade
server to which I'm remotely connected through an SSH tunnel. SQL is so
much better without an IMAP layer on top of it...
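As a rough illustration of why search becomes a plain SQL problem once
the text is normalized, here is a sketch of a word index queried with a
single statement. This is a generic inverted index, not a description
of how the search above is actually implemented; the tables (word,
occurrence) and the helpers open_index(), add_mail() and search() are
hypothetical, and sqlite3 stands in for PostgreSQL only to keep the
example self-contained.

```python
import re
import sqlite3


def open_index():
    """Create in-memory tables for mails and a simple word index."""
    db = sqlite3.connect(":memory:")
    db.executescript(
        """
        CREATE TABLE mail (
            mail_id INTEGER PRIMARY KEY,
            subject TEXT,
            body    TEXT
        );
        CREATE TABLE word (
            word_id INTEGER PRIMARY KEY,
            text    TEXT UNIQUE
        );
        CREATE TABLE occurrence (
            word_id INTEGER,
            mail_id INTEGER,
            PRIMARY KEY (word_id, mail_id)
        );
        """
    )
    return db


def tokenize(text):
    """Lowercase the text and return its distinct words."""
    return set(re.findall(r"\w+", text.lower()))


def add_mail(db, subject, body):
    """Store a mail and record which words occur in it."""
    cur = db.execute(
        "INSERT INTO mail (subject, body) VALUES (?, ?)", (subject, body)
    )
    mail_id = cur.lastrowid
    for w in tokenize(subject + " " + body):
        db.execute("INSERT OR IGNORE INTO word (text) VALUES (?)", (w,))
        word_id = db.execute(
            "SELECT word_id FROM word WHERE text = ?", (w,)
        ).fetchone()[0]
        db.execute(
            "INSERT INTO occurrence (word_id, mail_id) VALUES (?, ?)",
            (word_id, mail_id),
        )
    return mail_id


def search(db, query):
    """Return ids of mails containing every word of the query."""
    words = sorted(tokenize(query))
    if not words:
        return []
    marks = ",".join("?" * len(words))
    rows = db.execute(
        f"""
        SELECT o.mail_id
        FROM occurrence AS o
        JOIN word AS w ON w.word_id = o.word_id
        WHERE w.text IN ({marks})
        GROUP BY o.mail_id
        HAVING COUNT(DISTINCT o.word_id) = ?
        ORDER BY o.mail_id
        """,
        (*words, len(words)),
    ).fetchall()
    return [r[0] for r in rows]
```

The client sends one query and gets back a list of mail ids; there is
no per-message fetch-and-parse round trip as there would be when
searching through a protocol layer.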
Now, my dedicated MUA isn't as feature-rich as other popular mailers,
and it can't be used offline despite being a desktop app, and has other
deficiencies, but other mailer/server combinations come with their own
sets of problems and inadequacies, too :)
Regards,
--
Daniel
PostgreSQL-powered mail user agent and storage:
http://www.manitou-mail.org