Re: Can MHonArc find threads of deleted parents?

On May 6, 2005 at 02:06, East Coast Coder wrote:

> I should add that storage is not such a concern (as it is only an
> increase of 2x over the html archives anyway), but processing is...

Processing may become an issue if you use a single mhonarc archive
for all messages.  Mhonarc archives do not scale well as the number
of messages increases, but I have had reports of users with a single
archive of over 60,000 messages.

The common solution to get around this problem is to use period-based
archives.  For example, a mailing list may be broken up into multiple
mhonarc archives, where each archive represents a well-known time
period, like a month.  Mharc uses this approach.
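
As a rough illustration of the period-based layout (this is not how
Mharc itself is implemented), the sketch below maps a message's Date
header to a per-month archive directory.  The function name, the
"archives/mylist" root, and the YYYY-MM directory naming are all
assumptions made for the example.  The month is taken in the sender's
own time zone, which is a simplification.

```python
from email.utils import parsedate_to_datetime
from pathlib import PurePosixPath


def period_archive_dir(list_root: str, date_header: str) -> str:
    """Return the monthly archive directory for a message.

    list_root and the YYYY-MM naming are hypothetical; adapt them to
    whatever layout your own archives use.
    """
    # Parse an RFC 2822 style Date header into a datetime.
    sent = parsedate_to_datetime(date_header)
    # One archive per calendar month, e.g. archives/mylist/2005-05.
    return str(PurePosixPath(list_root) / f"{sent.year:04d}-{sent.month:02d}")


if __name__ == "__main__":
    print(period_archive_dir("archives/mylist",
                             "Fri, 06 May 2005 02:06:00 -0400"))
```

Each message would then be added to the mhonarc archive living in the
directory returned for it, so no single archive grows without bound.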

Unfortunately, such a scheme is not nice for threads that span
multiple time periods.  Mharc provides a link on message pages to
other messages with the same subject.  This does an automatic search
across all periods.  Unfortunately, it is definitely not as usable
as the thread navigation mhonarc provides, and it does not help if
a thread has messages with different subjects.

Mail-archive.com uses an alternative approach that the maintainers
have dubbed "poor-man windowing".  In this model, they have a single
archive, but only have mhonarc remember the last X messages.  Older
messages are not in the .mhonarc.db file, but the html files are
not deleted (KEEPONRM resource).
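
To make the windowing idea concrete, here is a small toy model.  It
is only a sketch of the behavior described above, not mail-archive's
or mhonarc's actual code.  The class and attribute names are invented
for the example: "db" stands in for the .mhonarc.db file and
"html_files" for the message pages on disk.

```python
from collections import OrderedDict


class WindowedArchive:
    """Toy model of an archive that remembers only the last X messages."""

    def __init__(self, window: int) -> None:
        self.window = window
        self.db = OrderedDict()   # stand-in for .mhonarc.db: id -> subject
        self.html_files = set()   # stand-in for html message pages on disk

    def add(self, msg_id: str, subject: str) -> None:
        self.db[msg_id] = subject
        self.html_files.add(msg_id)
        # Forget the oldest entries once the window is exceeded.
        # The html pages are deliberately left in place.
        while len(self.db) > self.window:
            self.db.popitem(last=False)

    def index(self) -> list:
        """Message ids that would still appear on the index pages."""
        return list(self.db)


if __name__ == "__main__":
    archive = WindowedArchive(window=2)
    for n in (1, 2, 3):
        archive.add(f"msg{n}", "Re: example thread")
    print(archive.index())
    print(sorted(archive.html_files))
```

With a window of 2, the first message drops out of the index after
the third is added, but its html page is still there to be reached by
a search or a direct link.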

There are limitations to this scheme:

* Index pages do not exist for older messages.  For mail-archive, this
  has not been a big problem since $TSLICE$ is used to still allow
  thread navigation at the message page level.  Also, older messages
  are normally reached via searching (either via Google or via
  mail-archive's search) rather than through an index page.

  Since older messages do not have index pages, links to the index
  page from an old message go to a page that does not have the
  message listed.

* Archive edit operations are more complicated.  This is minimized
  by utilizing CSS as much as possible (requiring careful planning
  of mhonarc resource settings early on).  A custom script was required
  to edit old html message files that could not be easily modified
  via stylesheet changes.

* If a thread is very long and spans more than X messages, you will
  have a break in the thread on the index pages.  However, via $TSLICE$,
  a reader could still navigate the complete thread via the message
  pages.

There have been discussions about overcoming these limitations with
the existing mhonarc code base, but nothing has been implemented yet.


