Re: Stripping signature / tagline / adline
>> Is there any standard way to tell MHonArc to strip the
>> signature / tagline / adline (for free email providers)?
>> In plaintext, they are often delineated by multiple dashes
>> and a newline.
=v= The RFC standard is "-- \n", which Earl's code addresses,
but not many people seem to use it these days.
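
For illustration only, here is a minimal Python sketch of cutting a plain-text body at the "-- \n" delimiter. It is not Earl's code and not part of MHonArc; the function name strip_signature is made up for this example, and cutting at the last delimiter rather than the first is a choice made here.

```python
def strip_signature(body: str) -> str:
    """Drop the signature that follows a "-- " delimiter line.

    Hypothetical helper: assumes the delimiter is exactly two dashes
    and one space on a line of its own, and cuts at the LAST such line.
    """
    lines = body.splitlines(keepends=True)
    # Walk backwards so the last delimiter in the message wins.
    for i in range(len(lines) - 1, -1, -1):
        if lines[i].rstrip("\r\n") == "-- ":
            return "".join(lines[:i])
    # No delimiter found: leave the message untouched.
    return body
```

A line holding only "--" (no trailing space) is deliberately not matched, so mail clients that strip trailing whitespace will defeat this check.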
>> I think they are delineated in html by a tag - but I'm
>> not sure.
=v= Generally not, since people appending ads to messages
aren't interested in making it easy to detect them. I've
found that they tweak the format now and then, seemingly
at random. If you search for ASCII lines, make sure you've
found the *last* such line in the message, since the message
author might be doing something with lines as well.
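
The "last such line" advice can be sketched roughly as follows. This is an assumption-laden example, not anyone's production filter: the RULE pattern (three or more of - _ = * ~ alone on a line) and the name strip_after_last_rule are invented here, and real ad formats will need their own patterns.

```python
import re

# Assumed pattern for an "ASCII line": 3+ rule characters and nothing else.
RULE = re.compile(r"^[-_=*~]{3,}\s*$")

def strip_after_last_rule(body: str) -> str:
    """Cut the message at the last rule line, keeping earlier ones.

    Earlier rule lines are kept because the message author may have
    used them for their own purposes.
    """
    lines = body.splitlines(keepends=True)
    last = None
    for i, line in enumerate(lines):
        if RULE.match(line):
            last = i  # keep overwriting so the final match wins
    if last is None:
        return body  # nothing that looks like a rule line
    return "".join(lines[:last])
```

In practice one would probably also require the rule line to sit near the end of the message before cutting, since a message with no ad at all would otherwise lose whatever follows the author's own last rule line.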
=v= Topica is the worst of the email list services in this
regard; they append *and* prepend ads, and they jiggle the
format around from time to time. I've got a Perl filter to
get rid of this junk, but I find I have to change it from
time to time.
=v= Stuff from Hotmail usually has a one-liner appended to
it, and it almost always has an apostrophe. Actually, what
it almost always has is a "Windows 1252" charset "smart
single quote", often turning an ASCII message into one with
exactly one 8bit character. (This isn't always apparent from
the headers, which say "text/plain".)
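
One way to spot this, sketched in Python under the assumption that you have the raw body as bytes: look for bytes above 0x7F, and optionally map the four Windows-1252 "smart quote" bytes back to ASCII. The names eight_bit_offsets and asciify_smart_quotes are hypothetical, and the sample message below is made up rather than an actual Hotmail tagline.

```python
# Windows-1252 smart quotes and the ASCII characters to use instead.
CP1252_QUOTES = {
    0x91: b"'",  # left single quote
    0x92: b"'",  # right single quote (the usual "apostrophe")
    0x93: b'"',  # left double quote
    0x94: b'"',  # right double quote
}

def eight_bit_offsets(body: bytes) -> list:
    """Return the offsets of every byte above 0x7F in the raw body."""
    return [i for i, b in enumerate(body) if b >= 0x80]

def asciify_smart_quotes(body: bytes) -> bytes:
    """Replace Windows-1252 smart quotes with plain ASCII quotes."""
    for byte, replacement in CP1252_QUOTES.items():
        body = body.replace(bytes([byte]), replacement)
    return body

# Made-up sample: an otherwise ASCII message with one 8-bit character.
sample = b"It\x92s free\n"
offsets = eight_bit_offsets(sample)
cleaned = asciify_smart_quotes(sample)
```

This only makes sense when the body really is single-byte text; running it over UTF-8 would mangle multi-byte characters, so check the declared charset first and treat a missing one with suspicion.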