Re: Original source file name?
On May 31, 2001 at 20:54, J C Lawrence wrote:
> Good point. As the ultimate goal is to shove the entire message
> base into an SQL DB (I've got users begging for things like
> thread-bounded searches and the ability to gen meta views of an
> archive), I'll probably head that way.
>
> While it's a gruesome hack, I'm ultimately looking to use MHonArc as
> a front end processor which writes scripts as output which are then
> executed to input the message and all its particulars into an
> SQL DB. What I haven't figured out yet is how to properly extract
> the thread linkings for input into the DB, as well as how to
> effectively (ie scalably) provide the thread database to MHonArc
> when archiving a message (we're talking hundreds of thousands of
> messages, possibly small order millions).
It's on my TODO list to allow callback hooks during MHonArc processing.
The problem is that to allow a decent callback API, some of the internal
functions need changing. Something for probably a 2.5 release (whenever
that is).
With a hook, you can store the message-ids and references/in-reply-to
data in a DB, and then compute the threads from that. This is
basically what MHonArc does.
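As a rough illustration of that idea (a sketch only, not MHonArc code, and
in Python rather than Perl), the following stores message-ids and their
references in an SQLite database and picks each message's parent as the most
recent referenced message that is actually in the archive. The table names,
the function name build_thread_parents, and the rule that In-Reply-To
outranks References are all assumptions made for the example, and
subject-based matching is left out entirely.

```python
import sqlite3

def build_thread_parents(messages):
    """messages: iterable of (msgid, in_reply_to, references) tuples, where
    references is a list of message-ids, oldest first.
    Returns a dict mapping each msgid to its parent msgid (or None)."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE msg (msgid TEXT PRIMARY KEY)")
    db.execute("CREATE TABLE ref (msgid TEXT, pos INTEGER, refid TEXT)")

    for msgid, in_reply_to, references in messages:
        db.execute("INSERT INTO msg VALUES (?)", (msgid,))
        # Candidate parents, weakest first. Assumption: In-Reply-To, when
        # present, is the strongest candidate, so it goes last.
        candidates = [r for r in references if r != in_reply_to]
        if in_reply_to:
            candidates.append(in_reply_to)
        for pos, refid in enumerate(candidates):
            db.execute("INSERT INTO ref VALUES (?, ?, ?)", (msgid, pos, refid))

    # For each message, take the highest-ranked reference that exists in
    # the archive; messages with no archived reference become thread roots.
    rows = db.execute(
        """
        SELECT m.msgid,
               (SELECT r.refid
                  FROM ref r JOIN msg p ON p.msgid = r.refid
                 WHERE r.msgid = m.msgid AND r.refid <> m.msgid
                 ORDER BY r.pos DESC LIMIT 1)
          FROM msg m
        """
    ).fetchall()
    db.close()
    return dict(rows)
```

With a real archive the database would live on disk and be updated
incrementally from the hook; the in-memory database here just keeps the
sketch self-contained.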
> At that point my main interests in MHonArc are its excellent MIME
> and charset handling (damned fine job BTW). I'd like to also use it
> to build the thread graph rather than dynamically building it off
> the References/In-Reply-To headers, as MHonArc properly
> handles the matching-subject thread hits.
With the current code base, you can access the thread listing order.
There are multiple approaches, but one is creating a custom mhonarc
that does a dump of thread data after an archive update in some format
you need. Two main variables are created when generating the thread
data: @TListOrder and %Index2TLoc. The first is a list of message
indexes in the order to be rendered on a thread index page. The
second is a hash that maps a message index to the ordinal thread index
position (useful in resource variable resolution).
Also generated is the %ThreadLevel hash. This maps a message index
to the thread depth of the message. A depth of 0 means it is a
root-level message. Therefore, with @TListOrder and %ThreadLevel one
can infer the thread tree structure.
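To make the inference concrete, here is a minimal sketch (in Python, not
MHonArc code) that recovers parent links from an ordered list plus a depth
map. The arguments stand in for @TListOrder and %ThreadLevel, the function
name thread_parents is made up, and it assumes a message's parent is the
nearest preceding message in the list with a smaller depth.

```python
def thread_parents(t_list_order, thread_level):
    """t_list_order: message indexes in thread-index rendering order.
    thread_level: dict mapping message index -> thread depth (0 = root).
    Returns a dict mapping each message index to its parent (or None)."""
    parents = {}
    stack = []  # (depth, index) pairs along the current root-to-leaf path
    for idx in t_list_order:
        depth = thread_level[idx]
        # Drop entries at the same or greater depth; they cannot be
        # ancestors of the current message.
        while stack and stack[-1][0] >= depth:
            stack.pop()
        parents[idx] = stack[-1][1] if stack else None
        stack.append((depth, idx))
    return parents
```

A dump of (message index, parent index) pairs produced this way could be
loaded directly into a two-column table in the SQL DB.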
These structures represent message threads sequentially rather than as a
tree, but that is conducive to generating the HTML thread index pages,
since those are generated in a sequential manner. Also, in the Perl 4 days,
doing complex tree structures was a non-trivial task.
BTW, the following is a snippet from mhinit.pl:
## Following variables used in thread computation
@ThreadList       = ();  # List of messages visible in thread index
@NotIdxThreadList = ();  # List of messages not visible in index
%HasRef           = ();  # Flags if message has references (Keys = indexes)
                         #   (Values = reference message indexes)
%HasRefDepth      = ();  # Depth of reference from HasRef value
%Replies          = ();  # Msg-ids of explicit replies (Keys = indexes)
%SReplies         = ();  # Msg-ids of subject-based replies (Keys = indexes)
%TVisible         = ();  # Message visible in thread index (Keys = indexes)
$DoMissingMsgs    = 0;   # Flag if missing messages should be noted in index
Unfortunately, my memory needs refreshing on all the threading stuff,
so I'm probably forgetting something. The multi-page index support
does complicate some of the stuff (hence the visible/non-visible
comments).
--ewh