Re: Truncated text during Xapian indexing

Thanks for your reply, that was very interesting and helpful!

--On 15 February 2018 at 16:12:23 +0100 Robert Stepanek <rsto@xxxxxxxxxxxxxxxx> wrote:

Just out of curiosity, how is the mapping between a Xapian docid and a
message file on disk achieved? I played around with xapian-delve and the
Perl example simplesearch.pl. When I search for a term, I get a list of
docids, but how do I know which message each one refers to?

In 3.x, Cyrus search stores an internal unique message id, called guid,
as docid in Xapian. The guid currently is a SHA-1 hash of the raw
message, allowing for deduplication and to avoid re-indexing already seen
messages. The conversations.db of a user maps this guid to a list of
mailbox:UID pairs.
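
To illustrate the idea described above, here is a minimal Python sketch. It
is not Cyrus code: the names message_guid and GuidIndex are hypothetical, and
the in-memory dictionary only stands in for the guid to mailbox:UID mapping
that conversations.db is said to hold. The one thing taken from the
description is that the guid is a SHA-1 hash of the raw message bytes.

```python
import hashlib
from collections import defaultdict


def message_guid(raw_message: bytes) -> str:
    """Return the SHA-1 hex digest of the raw message bytes.

    Assumption: this mirrors "guid = SHA-1 of the raw message" as described
    in the thread; how Cyrus encodes it internally is not shown here.
    """
    return hashlib.sha1(raw_message).hexdigest()


class GuidIndex:
    """Hypothetical stand-in for the guid -> [(mailbox, uid), ...] mapping."""

    def __init__(self):
        self._locations = defaultdict(list)

    def add(self, mailbox: str, uid: int, raw_message: bytes) -> bool:
        """Record where a message lives.

        Returns True if the guid was not seen before (the message would need
        indexing) and False if it is a duplicate of an already seen message.
        """
        guid = message_guid(raw_message)
        first_seen = guid not in self._locations
        self._locations[guid].append((mailbox, uid))
        return first_seen

    def lookup(self, guid: str):
        """Return the list of (mailbox, uid) pairs recorded for a guid."""
        return list(self._locations.get(guid, []))


if __name__ == "__main__":
    raw = b"Subject: hi\r\n\r\nbody\r\n"
    index = GuidIndex()
    index.add("INBOX", 1, raw)
    index.add("Archive", 7, raw)  # same bytes, so same guid
    print(index.lookup(message_guid(raw)))
```

Two copies of the same message in different mailboxes hash to the same guid,
which is why one indexed document can correspond to several mailbox:UID
pairs.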

Off the top of my head, there currently isn't an "official" way in Cyrus
to retrieve the mailbox:UID list for a given guid outside the Cyrus
process. Depending on your use case, you could either 1.) build a
custom mapper on top of imap/conversations.h, or 2.) use cvt_cyrusdb to
dump the contents of a conversations.db into plain text.
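
For option 2, a small Python sketch of how the cvt_cyrusdb call could be
assembled. The helper name cvt_cyrusdb_command and both paths are
hypothetical, and "twoskip" as the source backend is an assumption that
depends on the conversations_db setting of the installation. The argument
order (old file, old format, new file, new format) follows the cvt_cyrusdb
man page, which also asks for full paths.

```python
import os
import shlex


def cvt_cyrusdb_command(src: str, src_backend: str,
                        dst: str, dst_backend: str = "flat") -> list:
    """Build the argv for: cvt_cyrusdb <src> <src_backend> <dst> <dst_backend>.

    The command is only constructed here, not executed. Both paths must be
    absolute, since cvt_cyrusdb expects full paths.
    """
    for path in (src, dst):
        if not os.path.isabs(path):
            raise ValueError(f"cvt_cyrusdb needs a full path, got: {path!r}")
    return ["cvt_cyrusdb", src, src_backend, dst, dst_backend]


if __name__ == "__main__":
    # Hypothetical paths; adjust to the local configdirectory layout.
    argv = cvt_cyrusdb_command(
        "/var/lib/imap/user/j/jdoe.conversations",
        "twoskip",  # assumption: backend configured for conversations_db
        "/tmp/jdoe.conversations.txt",
    )
    print(shlex.join(argv))
```

Running the resulting command as the cyrus user, on a copy of the database
rather than the live file, is the cautious way to try this.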

FWIW, that conversion is so "lossy" as to be useless. But it was really only curiosity, so it doesn't matter.

Cheers,
Sebastian
--
   .:.Sebastian Hagedorn - Weyertal 121 (Gebäude 133), Zimmer 2.02.:.
                .:.Regionales Rechenzentrum (RRZK).:.
  .:.Universität zu Köln / Cologne University - ✆ +49-221-470-89578.:.


