A Filter to Speed up nmh Message Scans

blinux-list@redhat.com (Martin McCormick) · Mon, 27 May 2002 12:01:55 -0500



	I just wrote a little C program that I use to speed up
scanning large nmh message folders.  You have to set up a scan
format file that passes only the message number plus the subject.
My filter ignores the message number because it always changes,
but if two or more messages in a row have the same subject, you
only hear the first scan line.  It silently skips all the rest of
the lines with the same subject and then wakes up when the
subject changes.
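
	As a rough illustration of the skipping logic just
described, here is a minimal sketch in C.  It is not the code in
subjects.c.  It assumes each scan line looks like a message number
followed by the subject, and the names scan_state, subject_of,
is_new_subject, filter_stream and MAX_LINE are made up for this
example.

```c
#include <assert.h>
#include <ctype.h>
#include <stdio.h>
#include <string.h>

#define MAX_LINE 1024   /* assumed upper bound on one scan line */

/* Remembers the subject of the most recently printed line. */
struct scan_state {
    char prev[MAX_LINE];
    int  seen;          /* 0 until the first line has been handled */
};

/* Return a pointer to the subject part of a scan line by skipping
   leading blanks, the message number, and the blanks after it. */
const char *subject_of(const char *line)
{
    while (*line == ' ' || *line == '\t')
        line++;
    while (isdigit((unsigned char)*line))
        line++;
    while (*line == ' ' || *line == '\t')
        line++;
    return line;
}

/* Return 1 if this line's subject differs from the previous one
   (so the line should be printed), 0 if it is a repeat. */
int is_new_subject(struct scan_state *st, const char *line)
{
    char key[MAX_LINE];

    assert(st != NULL && line != NULL);
    strncpy(key, subject_of(line), sizeof key - 1);
    key[sizeof key - 1] = '\0';
    key[strcspn(key, "\n")] = '\0';     /* ignore the trailing newline */

    if (st->seen && strcmp(key, st->prev) == 0)
        return 0;                       /* same subject: stay silent */

    strcpy(st->prev, key);              /* key fits: same buffer size */
    st->seen = 1;
    return 1;
}

/* Copy scan lines from in to out, dropping consecutive repeats.
   Lines longer than MAX_LINE - 1 bytes are read in pieces, which
   this sketch does not try to handle. */
void filter_stream(FILE *in, FILE *out)
{
    struct scan_state st = { "", 0 };
    char line[MAX_LINE];

    while (fgets(line, sizeof line, in) != NULL)
        if (is_new_subject(&st, line))
            fputs(line, out);
}
```

	Calling filter_stream(stdin, stdout) from main would turn
this into a filter you can pipe scan output through.  Note that
it compares subjects exactly, so it only catches repeats that
are spelled the same way.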

	I have everything you need to make it run in a uuencoded
file that decodes to a file called subjects.tar.gz.  When you
unpack that, you get a directory called subjects.  In there is a
file called doc.txt and the source, called subjects.c.

	In doc.txt, I tell you how to build it and what its
limitations are.

	The main thing that throws it off is when several people
post subjects that are essentially the same, but have been
re-spelled or otherwise reworked.  My filter does a few tricks to
get around common variations, but it is not sophisticated at
all.  All I do is force all words in the subject to upper case
and remove all whitespace and punctuation.  That still isn't
enough, but anything else gets into the realm of the very
complex.  This is quick and dirty.
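
	To make that normalizing step concrete, here is a small
sketch in C of one way to do it.  It is an assumption about the
general approach, not a copy of what subjects.c does, and the
names normalize_subject, same_subject and MAX_KEY are invented for
the example.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

#define MAX_KEY 256     /* assumed maximum length of a comparison key */

/* Build a comparison key from subject: letters are forced to upper
   case, and whitespace and punctuation are dropped.  At most
   keysz - 1 characters are stored, followed by a terminating NUL. */
void normalize_subject(const char *subject, char *key, size_t keysz)
{
    size_t n = 0;

    assert(subject != NULL && key != NULL);
    if (keysz == 0)
        return;                         /* no room even for the NUL */

    for (; *subject != '\0' && n + 1 < keysz; subject++) {
        unsigned char c = (unsigned char)*subject;

        if (isspace(c) || ispunct(c))
            continue;                   /* skip blanks and punctuation */
        key[n++] = (char)toupper(c);
    }
    key[n] = '\0';
}

/* Return 1 if two subjects have the same key after normalizing. */
int same_subject(const char *a, const char *b)
{
    char ka[MAX_KEY];
    char kb[MAX_KEY];

    normalize_subject(a, ka, sizeof ka);
    normalize_subject(b, kb, sizeof kb);
    return strcmp(ka, kb) == 0;
}
```

	With this, "Speed up nmh scans" and "speed-up  NMH scans."
compare as equal, but a subject that has been reworded still
counts as a different subject.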

	The uuencoded file is 60 lines long and I could post it
to the list, but that's not fair to those who don't care.

Martin McCormick WB5AGZ  Stillwater, OK 
OSU Center for Computing and Information Services Network Operations Group