Re: [approved] Function mapping messages to IDs?
On September 9, 2003 at 19:47, Alejandro Forero Cuervo wrote:
> > If the input is from a MH-style folder, mhonarc will actually
> > do a numeric sort on the directory contents first to provide
> > some parallel to filename numbers. However, if the MHPATTERN
> > resource is customized, the order in which the files are processed
> > is somewhat arbitrary.
>
> I'm using
>
> $ mhonarc -rcfile $file -mhpattern '^[^\.]' $maildir/{cur,new}
>
> to generate the archive and the same command with -add to keep it
> up to date.
>
> Would it be too hard to have MHonArc sort the messages by their
> date (rather than by their filename) and use that order when
> assigning them IDs?
It would not be hard, but it adds extra overhead since each file
would have to be stat'ed.
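
To make the overhead concrete, here is a minimal Python sketch of ordering a directory's files by modification time. It is not MHonArc code: the function name `sort_by_mtime` is hypothetical, and it assumes "date" means the file's mtime rather than the message's Date: header.

```python
import os
import stat

def sort_by_mtime(directory):
    """Return the regular files in `directory`, oldest mtime first."""
    entries = []
    for name in os.listdir(directory):
        path = os.path.join(directory, name)
        info = os.stat(path)  # one stat() call per file: the extra overhead
        if stat.S_ISREG(info.st_mode):
            entries.append((info.st_mtime, name, path))
    # Files with equal mtimes fall back to filename order, so the
    # result is deterministic for a given directory.
    entries.sort()
    return [path for _, _, path in entries]
```

Note that two files can share an mtime, so a tie-breaker (the filename here) is still needed to get a repeatable order.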
> > BTW, MHonArc is designed where the message number really does
> > not mean much. Why is the message numbering assignment
> > important to you?
>
> Because the URL given to each message depends on the number. If
> the order is somewhat arbitrary one could run into trouble when
> regenerating the archive from different sources as the URLs would
> change and links from other locations would break.
>
> I think it would be best to use some criteria that allows MHonArc
> to always give the same file names to the same messages
> regardless of whether the archive is generated from a MBox or a
> Maildir.
This issue has been brought up before and it is a known problem.
The date sorting method you advocate does not solve the real problem.
Numbering will still change on an archive rebuild if even one of
the original raw messages has been removed.
It is worth noting that your case is somewhat unusual since you changed
storage formats for the raw data, causing them to be processed in a
different order.
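
The renumbering problem can be illustrated with a small Python sketch. This is only a model of sequential numbering in processing order, not MHonArc's implementation; the function name `assign_numbers`, the example message-ids, and the starting number 0 are all assumptions made for illustration.

```python
def assign_numbers(message_ids):
    """Number messages sequentially in processing order, starting at 0."""
    return {mid: n for n, mid in enumerate(message_ids)}

# Hypothetical message-ids, already in a fixed (e.g. date-sorted) order.
original = ["<a@example.org>", "<b@example.org>", "<c@example.org>"]

# Rebuild after one raw message has been removed from the source.
rebuilt = [mid for mid in original if mid != "<b@example.org>"]

before = assign_numbers(original)
after = assign_numbers(rebuilt)

# Every message after the removed one gets a different number,
# even though the sort order itself never changed.
shifted = [mid for mid in rebuilt if before[mid] != after[mid]]
```

Here `shifted` contains `<c@example.org>`: its number changes from 2 to 1, so any URL built from the number would break regardless of how the messages were sorted.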
> This criteria should also give the messages the same names
> regardless of whether the archive has been constantly updated
> using the ``-add'' argument or is rebuilt from the ground up and
> the only criteria I can think of to make this possible is using
> the date (actually the date of arrival to the archive, hmm).
No, the best method would be to use the message-id, since message-ids
are unique while you can have multiple messages with the same date
(which can provide a source of numbering inconsistency depending
on how the sorting algorithm works).
Unfortunately, such a change at this time brings up compatibility
issues. It may be possible to make it a configurable option.
Using message-ids for filenames has been on the TODO list for
a while.
Note, the annotation feature of mhonarc (does anyone even use
it?) actually uses message-id filenames so annotations can be preserved
on rebuilds or shared across multiple archives.
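
As a rough sketch of the idea, a filename derived only from the message-id stays the same no matter what order the messages are processed in or which other messages are present. The Python below is an illustration under stated assumptions, not how MHonArc or its annotation feature actually names files: the function name `filename_for`, the SHA-1 hash, and the `msg-` prefix are all hypothetical choices.

```python
import hashlib

def filename_for(message_id):
    """Derive a stable HTML filename from a Message-ID header value."""
    # Normalize: drop surrounding whitespace and angle brackets.
    key = message_id.strip().strip("<>")
    # Hash the id so the result is filesystem-safe and fixed-length.
    digest = hashlib.sha1(key.encode("utf-8")).hexdigest()
    return "msg-" + digest + ".html"

# Hypothetical message-ids, processed in two different orders.
ids = ["<a@example.org>", "<b@example.org>", "<c@example.org>"]
first_run = {mid: filename_for(mid) for mid in ids}
second_run = {mid: filename_for(mid) for mid in reversed(ids)}
```

Both runs map each message to the same filename, and removing a message from the input does not affect the names of the others. Messages that lack a message-id, or that share one, would need separate handling that this sketch does not attempt.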
> I recently had to rebuild my HTML archive (as I changed some
> options and also as I converted my archive from Mbox to Maildir)
> and I noticed that the URLs had changed.
Later versions of MHonArc have the RECONVERT resource. Although
less efficient than doing a virgin rebuild, it allows you to
effectively rebuild an archive but preserve message numbers.
The mharc system, <http://www.mhonarc.org/mharc/>, deals with
the problem by providing a "Permanent Link" for bookmarking purposes.
It utilizes the underlying search engine to provide persistent
links to messages. You can see an example of this in the mhonarc.org
mail archives.
--ewh
---------------------------------------------------------------------
To sign-off this list, send email to majordomo@mhonarc.org with the
message text UNSUBSCRIBE MHONARC-USERS