Re: MHonArc v2.5.4 Released



On May 3, 2002 at 09:06, "Edward Wildgoose" wrote:

> Just an idea with regards to the previous threads.  Would it be
> sensible/efficient/possible to use this to expand the $NOTE$ automatically
> to include the body of each email?

Ah, you caught the motivation behind the additional API callbacks.

> Is this the right way to go about it, or is it best for me to try and work
> this modification into the code elsewhere?

I think you will need to use multiple API callbacks.  There
is more than one way to do it.  One way would use the
$CBMessageBodyRead, $CBRcVarExpand, and $CBDbSave callbacks.

The $CBMessageBodyRead callback is used to extract the first X
characters from the converted message body.  I'd probably strip out
any HTML tags since the data is only meant to give a brief look at
what the message may be.  The data extracted would be stored in a
custom hash, let's call it %X_MesgPreview, where the keys are the
message indexes and the values are the extracted preview text.
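
A minimal sketch of that extraction step, independent of MHonArc
itself.  The function name preview_of, the sample key, and the sample
HTML are hypothetical, and the tag stripping is a plain regex pass, not
a real HTML parser.  It also ignores entity references, so a cut can
land in the middle of one.

```perl
use strict;
use warnings;

# Hypothetical store: message index => preview text.
my %X_MesgPreview;

# Strip tags, collapse whitespace, keep at most $max characters.
sub preview_of {
    my ($html, $max) = @_;
    (my $text = $html) =~ s/<[^>]*>//g;   # drop anything that looks like a tag
    $text =~ s/\s+/ /g;                   # minimize whitespace
    $text =~ s/\A //;                     # trim blank left at the start
    $text =~ s/ \z//;                     # trim blank left at the end
    return length($text) > $max ? substr($text, 0, $max) . '...' : $text;
}

# Sample data only; a real callback would receive the converted body.
my $index = 'msg-1';
$X_MesgPreview{$index} =
    preview_of("<p>Hello <b>list</b>,\n  a quick idea.</p>", 16);
print "$X_MesgPreview{$index}\n";
```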

For the $CBRcVarExpand callback, it will check for variable references
called $X-MSG-PREVIEW$ and expand it using the %X_MesgPreview hash.
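
To make the expansion step concrete, here is a rough stand-alone
sketch.  Everything in it is hypothetical: expand_var and
expand_template are stand-ins written for this message, not MHonArc
functions, and the two-element return value is a simplification of
what the real callback has to return.

```perl
use strict;
use warnings;

# Sample data; the keys stand in for message indexes.
my %X_MesgPreview = ('msg-1' => 'Hello list, a quick idea...');

# Hypothetical expander: returns (value, recognized-flag).
sub expand_var {
    my ($index, $name) = @_;
    if ($name eq 'X-MSG-PREVIEW') {
        my $value = $X_MesgPreview{$index};
        return (defined $value ? $value : '', 1);
    }
    return (undef, 0);    # not ours; let the caller try its own variables
}

# Tiny stand-in for a template engine, for demonstration only.
sub expand_template {
    my ($index, $template) = @_;
    $template =~ s{\$([A-Z][A-Z0-9-]*)\$}{
        my ($value, $known) = expand_var($index, $1);
        $known ? $value : "\$$1\$";
    }ge;
    return $template;
}

print expand_template('msg-1', '<i>$X-MSG-PREVIEW$</i> by $FROMNAME$'), "\n";
```

Unrecognized variables are left untouched, which mirrors the idea of
handing them back to the caller to expand.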

The $CBDbSave callback can be used to save %X_MesgPreview to the
database file so it will be reloaded each time the archive is processed.
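
The save-and-reload idea can be sketched without MHonArc at all.  The
example below uses Data::Dumper to write a hash out as Perl source and
evaluates that text to get it back.  This is only an illustration of
the principle; dump_hash and load_hash are hypothetical names, and it
is not how MHonArc's own database routines are implemented.

```perl
use strict;
use warnings;
use Data::Dumper;

# Sample data; keys stand in for message indexes.
my %preview = ('msg-1' => 'Hello list...', 'msg-2' => 'Re: "quoted" & more');

# Serialize the hash as Perl source text of the form "$VAR1 = {...};".
sub dump_hash {
    my ($href) = @_;
    local $Data::Dumper::Indent   = 1;
    local $Data::Dumper::Sortkeys = 1;   # stable output between runs
    return Dumper($href);
}

# Reload by evaluating the saved text; only safe for text we wrote ourselves.
sub load_hash {
    my ($source) = @_;
    my $VAR1;                            # Data::Dumper's default variable name
    eval $source;
    die "reload failed: $@" if $@;
    return $VAR1;
}

my $saved    = dump_hash(\%preview);
my $reloaded = load_hash($saved);
print "$_ => $reloaded->{$_}\n" for sort keys %$reloaded;
```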

Since I wanted to see if all this is actually possible, I wrote
a custom front-end myself.  The program is attached to this message
along with the manpage and a sample resource file to test it out.

BTW, an alternative approach is to never store message preview data
in the database but to always extract it from the message pages.  The
problem with this is it would be very I/O intensive.
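
For comparison, a rough sketch of what that alternative involves for a
single message page.  The marker strings are an assumption about how
the message pages delimit the body, so check the pages in your own
archive; preview_from_page is a hypothetical name, and the page here is
an in-memory string, not a real file.

```perl
use strict;
use warnings;

# Assumed body markers; verify against your own message pages.
my $BODY_BEGIN = '<!--X-Body-of-Message-->';
my $BODY_END   = '<!--X-Body-of-Message-End-->';

# Read one message page from a filehandle and return up to $max
# characters of tag-free body text.
sub preview_from_page {
    my ($fh, $max) = @_;
    my ($in_body, $text) = (0, '');
    while (my $line = <$fh>) {
        if (!$in_body) {
            $in_body = 1 if index($line, $BODY_BEGIN) >= 0;
            next;
        }
        last if index($line, $BODY_END) >= 0;
        $text .= $line;
    }
    $text =~ s/<[^>]*>//g;     # strip tags after joining lines
    $text =~ s/\s+/ /g;        # minimize whitespace
    $text =~ s/\A | \z//g;     # trim a leading or trailing blank
    return substr($text, 0, $max);
}

# Sample page held in memory for demonstration.
my $page = join "\n",
    '<html><body>',
    $BODY_BEGIN,
    '<pre>Just an idea with regards',
    'to the previous threads.</pre>',
    $BODY_END,
    '</body></html>', '';
open(my $fh, '<', \$page) or die "cannot open in-memory page: $!";
print preview_from_page($fh, 20), "\n";
close($fh);
```

Every index rebuild would have to open and scan each message page this
way, which is where the I/O cost comes from.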

--ewh

#!/usr/local/bin/perl
##---------------------------------------------------------------------------##
##  File:
##	$Id: mha-preview,v 1.2 2002/05/03 20:52:43 ehood Exp $
##  Author:
##      Earl Hood       earl@earlhood.com
##  Description:
##      Custom MHonArc-based program that supports $X-MSG-PREVIEW$
##	resource variable using the callback API.
##
##	Invoke program with -man option to see manpage.
##---------------------------------------------------------------------------##
##    Copyright (C) 2002	Earl Hood, earl@earlhood.com
##    This program is free software; you can redistribute it and/or modify
##    it under the same terms as MHonArc itself.
##---------------------------------------------------------------------------##

# Uncomment and modify the following if the MHonArc libraries were not
# installed in perl's site directory or in perl's normal search path.
#use lib qw(/path/to/mhonarc/libraries);

package MHAPreview;

use Getopt::Long;

# Max size of preview text: This is the maximum amount that will be
# saved for each message.  The resource variable length modifier can
# be used to always display less than max, but it is best to avoid
# doing that since it is a slow operation.  We have a custom command-line
# option to set the max size if code change is not desired.
my $PreviewLen = 256;

##-----------------------------------------------------------------------##
##  Main Block
##-----------------------------------------------------------------------##

MAIN: {
    unshift(@INC, 'lib');	# Should I leave this line in?

    ## Grab options from @ARGV unique to this program
    my %opts = ( );
    Getopt::Long::Configure('pass_through');
    GetOptions(\%opts,
      'prv-maxlen=i',
      'help',
      'man'
    );
    usage(1) if $opts{'help'};
    usage(2) if $opts{'man'};

    if ($opts{'prv-maxlen'} && ($opts{'prv-maxlen'} > 0)) {
      $PreviewLen = $opts{'prv-maxlen'};
    }

    ## Reset pass-through of options
    Getopt::Long::Configure('no_pass_through');

    ## Initialize MHonArc
    ## (eval is used since "require X || die" never reaches the die)
    eval { require 'mhamain.pl'; 1 }
      or die qq/ERROR: Unable to require "mhamain.pl": $@\n/;
    mhonarc::initialize();

    ## Register callbacks for handling preview text
    register_callbacks();

    ## Process input.
    mhonarc::process_input() ? exit(0) : exit($mhonarc::CODE);
}

##-----------------------------------------------------------------------##
##  Callback Functions
##-----------------------------------------------------------------------##

sub register_callbacks {
  $mhonarc::CBMessageBodyRead = \&msg_body_read;
  $mhonarc::CBRcVarExpand     = \&rc_var_expand;
  $mhonarc::CBDbSave	      = \&db_save;
}

sub msg_body_read {
  my($fields, $html, $files) = @_;
  my $mha_index = $fields->{'x-mha-index'};
  my $preview = extract_preview($html, $PreviewLen);
  $X_MessagePreview{$mha_index} = $preview;
}


sub rc_var_expand {
  my($mha_index, $var_name, $arg_str) = @_;

  # $X-MSG-PREVIEW(mesg_spec)$
  if ($var_name eq 'X-MSG-PREVIEW') {
    # Use MHonArc function to support a mesg_spec argument
    my ($lref, $key, $pos, $opt) =
	mhonarc::compute_msg_pos($mha_index, $var_name, $arg_str);
    return ($X_MessagePreview{$key}, 0, 1);
  }

  # If we do not recognize $var_name, tell MHonArc so that it
  # will try to expand the variable itself.
  (undef, 0, 0);
}


sub db_save {
  my($db_fh) = @_;
  # Make sure variable is package qualified!
  mhonarc::print_var($db_fh, 'MHAPreview::X_MessagePreview',
		     \%X_MessagePreview);
}

##-----------------------------------------------------------------------##
##  Support Functions
##-----------------------------------------------------------------------##

sub extract_preview {
  # Extracting the preview text of the message body is not as
  # trivial as you may expect.  We have to deal with HTML tags
  # and entity references, but want to avoid the overhead of
  # using a full-blown HTML parser.

  my $html     = shift;	# reference to HTML message body
  my $prev_len = shift;	# length of preview to extract

  my $text = "";
  my $html_len = length($$html);
  my($pos, $sublen, $er_len, $real_len);
  
  for ( $pos=0, $sublen=$prev_len; $pos < $html_len; ) {
    $text .= substr($$html, $pos, $sublen);
    $pos += $sublen;

    # strip tags
    $text =~ s/\A[^<]*>//; # clipped tag
    $text =~ s/<[^>]*>//g;
    $text =~ s/<[^>]*\Z//; # clipped tag

    # check for clipped entity reference: keep reading one character
    # at a time until the reference is complete
    while (($pos < $html_len) && ($text =~ /\&[^;]*\Z/)) {
      $text .= substr($$html, $pos, 1);
      ++$pos;
    }

    # minimize whitespace
    $text =~ s/\s+/ /g;

    # compute entity reference lengths to determine "real" character
    # count and not raw character count.
    $er_len = 0;
    while ($text =~ /(\&[^;]+);/g) {
      $er_len += length($1);
    }

    # done if we have enough
    $real_len = length($text)-$er_len;
    if ($real_len >= $prev_len) {
      if ($real_len < $html_len) {
	$text .= '...';
      }
      last;
    }

    $sublen = $prev_len - (length($text)-$er_len);
  }

  $text;
}

sub usage {
  require Pod::Usage;
  my $verbose = shift;
  if ($verbose == 0) {
    Pod::Usage::pod2usage(-verbose => $verbose);
  } else {
    my $pager = $ENV{'PAGER'} || 'more';
    local(*PAGER);
    my $fh = (-t STDOUT && open(PAGER, "|$pager")) ? \*PAGER : \*STDOUT;
    Pod::Usage::pod2usage(-verbose => $verbose,
                          -output  => $fh);
  }
  exit 0;
}

##-----------------------------------------------------------------------##
__END__

=head1 NAME

mha-preview - MHonArc front-end to support message preview variable

=head1 SYNOPSIS

S<B<mha-preview> [I<options>] [I<arguments>]>

=head1 DESCRIPTION

B<mha-preview> is an example program that utilizes MHonArc's callback
API to support the special resource variable C<$X-MSG-PREVIEW$>.
The C<$X-MSG-PREVIEW$> variable represents the initial text of a
message body.  With this variable, index pages can be customized to
give a listing like some MUAs that provide a glimpse of the message
body in the mail listing of a mail folder.

When extracting the preview text of the message body, all HTML tags
are removed and whitespace is compressed.

B<CAUTION>: If B<mha-preview> is used for an archive, it should
always be used to process the archive.  Otherwise, the message preview
data will be lost.

=head1 OPTIONS

B<mha-preview> takes the same options available to B<mhonarc> along
with the following additional options:

=over

=item C<-help>

Print a usage summary of this program (this option overrides
B<mhonarc>'s C<-help> option).

=item C<-man>

Print the manpage for this program.

=item C<-prv-maxlen>

Maximum number of characters of the message body to store for each
message.  The default value is 256.

=back

=head1 NOTES

=over

=item *

The functionality of this program could be placed into the
C<mhasiteinit.pl> library to avoid the need for this program and
to make it part of the locally installed B<mhonarc>.  This would
avoid the problem noted in the CAUTION mentioned
in the L<DESCRIPTION section|"DESCRIPTION">.

=item *

The body preview resource variable may be worth putting into the
MHonArc code base directly.

=back

=head1 SEE ALSO

mhonarc(1)

=head1 LICENSE

B<mha-preview> comes with ABSOLUTELY NO WARRANTY and can be distributed
under the same terms as MHonArc itself.

=head1 AUTHOR

Earl Hood, earl@earlhood.com

=cut

NAME
    mha-preview - MHonArc front-end to support message preview variable

SYNOPSIS
    mha-preview [*options*] [*arguments*]

DESCRIPTION
    mha-preview is an example program that utilizes MHonArc's callback API
    to support the special resource variable "$X-MSG-PREVIEW$". The
    "$X-MSG-PREVIEW$" variable represents the initial text of a message
    body. With this variable, index pages can be customized to give a
    listing like some MUAs that provide a glimpse of the message body in
    the mail listing of a mail folder.

    When extracting the preview text of the message body, all HTML tags are
    removed and whitespace is compressed.

    CAUTION: If mha-preview is used for an archive, it should always be used
    to process the archive. Otherwise, the message preview data will be
    lost.

OPTIONS
    mha-preview takes the same options available to mhonarc along with the
    following additional options:

    "-help"
        Print a usage summary of this program (this option overrides
        mhonarc's "-help" option).

    "-man"
        Print the manpage for this program.

    "-prv-maxlen"
        Maximum number of characters of the message body to store for each
        message. The default value is 256.

NOTES
    *   The functionality of this program could be placed into the
        "mhasiteinit.pl" library to avoid the need for this program and to
        make it part of the locally installed mhonarc. This would avoid the
        problem noted in the CAUTION mentioned in the DESCRIPTION section.

    *   The body preview resource variable may be worth putting into the
        MHonArc code base directly.

SEE ALSO
    mhonarc(1)

LICENSE
    mha-preview comes with ABSOLUTELY NO WARRANTY and can be distributed
    under the same terms as MHonArc itself.

AUTHOR
    Earl Hood, earl@earlhood.com

<!-- MHonArc Resource File -->
<!-- $Id$
     Earl Hood <earl@earlhood.com>
  -->
<!--	Specify date sorting.
  -->
<Sort>

<!--	Set USELOCALTIME since local date formats are used when displaying
	dates.
  -->
<UseLocalTime>

<!--    Define message local date format to print the month, day of
	the month, and year.  Format used for day group heading.
  -->
<MsgLocalDateFmt>
%B %d, %Y
</MsgLocalDateFmt>

<ListBegin>
<ul>
<li><a href="$TIDXFNAME$">Thread Index</a></li>
</ul>
<hr>
<dl>
</ListBegin>

<!--	DAYBEGIN defines the markup to be printed when a new day group
	is started.
  -->
<DayBegin>
<dt><strong>$MSGLOCALDATE$</strong></dt>
</DayBegin>

<!--	DAYEND defines the markup to be printed when a day group ends.
  -->
<DayEnd>

</DayEnd>

<!--	Define LITEMPLATE to display the message subject, author, time
	of day the message was sent, and the preview text for the
	message.
  -->
<LiTemplate>
<dd>
<b>$SUBJECT$</b>, <i>$FROMNAME$</i>, $MSGLOCALDATE(CUR;%H:%M)$<br>
<dl><dd><small><i>$X-MSG-PREVIEW$</i></small></dd></dl>
</dd>
</LiTemplate>

<!--	Define LISTEND to close list.
  -->
<ListEnd>
</dl>
</ListEnd>
