Re: (shrinking saved news): which header-lines to KEEP?

Hi,

Generally, the headers you'd want to preserve depend on which features
you'd want to keep working in the stripped folder.

If threading must be preserved, then Message-ID, In-Reply-To and References
have to remain in the message. Message-ID is also an excellent resource if
you want to refer somebody else to this message by a short link to its
copy in the Google archive instead of forwarding the message to him/her.

Second, if your messages contain MIME attachments you'd like to view with
helper applications, MIME-Version and Content-Type also should be kept.
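
To make the keep-list idea concrete, here is a minimal sketch (it is not
the attached script) that works the other way round: it keeps only the
headers named above and drops everything else in the header block. It
assumes a single message on standard input whose header ends at the first
blank line, and a sed that accepts -E. The function name keep_headers and
the exact list of kept headers are illustrative choices, not something
taken from an existing tool.

```shell
# Sketch: keep only a fixed list of headers (and their folded
# continuation lines) in the header block of ONE message on stdin.
# The body, i.e. everything after the first blank line, is untouched.
keep_headers() {
  sed -E '
    1,/^$/{
      # the blank line that ends the header block is always kept
      /^$/b
      # continuation line: keep it only if the header it belongs to
      # was kept (the hold space carries K for keep, D for drop)
      /^[[:space:]]/{
        x
        /^K$/{
          x
          b
        }
        x
        d
      }
      # headers on the keep-list: remember K in the hold space, print
      /^(From|Date|Subject|Message-I[Dd]|In-[Rr]eply-[Tt]o|References|M[Ii][Mm][Ee]-Version|Content-[Tt]ype):/{
        h
        s/.*/K/
        x
        b
      }
      # any other header: remember D in the hold space, delete
      h
      s/.*/D/
      x
      d
    }
  '
}
```

Usage would be along the lines of "keep_headers < article > article.weeded".
A folder holding several messages would need a different address range
than 1,/^$/, which is left out of this sketch.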

Please see the attached sed script, which I use when saving an article
from a newsgroup. A similar script is used for mailing lists, too.

HTH
-- 
Regards,						|       /^^^\
	Yury						|     (| , , |)
							|      |  *  |
E-mail: yury.burkatovsky at telrad dot co dot il	|       \_-_/

On Tue, 6 Aug 2002, David Combs wrote:

> I like to save chosen messages from various groups
> I'm on.  This stuff piles up!
>
> So, I'd like to remove most of the header-lines,
> to save disk-space, also to make paging
> through the messages a lot more pleasant.
>
> Now, what kind of things would *you* want
> to keep, were you doing this?
>
> Of course, subject, date, replyto, from:,
> ...
>
> What about Message-ID?  What's it good for?
>     (it's local to the isp?  or globally valid and useful?)
>
>
> References:?  (for rebuilding threads?)

# 980816
# Purpose:  Weeds (deletes) unwanted header lines from "news folders"
#   (text files containing emails)
# Installation:  Save this script as "weednews.sed".
# Usage:  sed -f weednews.sed folder > folder.weeded
#

b cont
# Delete the continuation lines of a folded header (they begin with white space)
:more
  /^[ 	]/{
    N
    s/^.*\n//
    t more
  }

:cont
/^Approved.*:/d
/^Comments:/d
/^Distribution:/d
/^Errors-[Tt]o:/d
/^Expires:/d
# /^In-[Rr]eply-[Tt]o:/{ 
#   N                    
#   s/^.*\n//            
#   t more               
# }                      
/^Lines:/d
# /^Message-[Ii][Dd]:/{
#   N
#   s/^.*\n//
#   t more
# }
/^N[Nn][Tt][Pp]-Posting.*:/d
/^Organization:/d
/^Originator:/d
/^Path:/d
/^Phone:/d
/^Post:/d
/^Precedence:/d
# /^References:/{
#   N
#   s/^.*\n//
#   t more
# }
/^Return-Receipt-[Tt]o:/d
/^Reply-To:/d
/^Sender:/{
  N
  s/^.*\n//
  t more
}
#/^Status:/d
/^Super[cs]edes:/d
#/^X-[a-zA-Z-]*:/d
/^X-Accept-Language:/d
/^X-Authentication-.*:/{
  N
  s/^.*\n//
  t more
}
#/^X-Bulletin:/d
/^X-[A-E].*:/d
/^X-Face:/{
  N
  s/^.*\n//
  t more
}
/^X-[H-K].*:/d
/^X-Listprocessor-Version:/d
/^X-Location:/d
/^X-Mailer:/d
/^X-M[Ss]/{
  N
  s/^.*\n//
  t more
}
/^X-MyDeja-Info:/d
/^X-[NO].*:/d
/^X-Server-Date:/d
#/^X-Status:/d
/^X-Sun-Charset:/d
/^X-URL:/d
/^X400/d
/^X-Sender:/d
/^X-M[Ii][Mm][Ee].*:/d
/^X-.*Priority:/d
/^X-T.*:/d
/^X-U[Rr][Ll].*:/d
/^X-Vms-To:/d
/^X-[W-Z].*:/d
/^[Xx]ref:/d
# End of script
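
For readers who want to see how the ":more" loop removes a folded header,
here is a cut-down sketch that can be run on its own. It keeps only the
branching skeleton plus two rules (X-Face with continuation handling, Path
with a plain delete). The function name weed_demo and the sample input are
made up for illustration, and [[:space:]] stands in for the literal
space-and-tab bracket used in the full script.

```shell
# Reduced version of the branching logic, with only two rules.
weed_demo() {
  sed '
# a freshly read line skips the continuation check
b cont
:more
# continuation line of a header being removed: append the next line,
# drop everything up to the newline, then test the new line again
/^[[:space:]]/{
N
s/^.*\n//
t more
}
:cont
# possibly folded header: drop its first line, then enter the loop
/^X-Face:/{
N
s/^.*\n//
t more
}
# single-line header: a plain delete is enough
/^Path:/d
'
}

# Sample run: X-Face is folded over three lines.
printf 'From: a@example.org\nX-Face: abc\n def\n ghi\nPath: x!y\nSubject: hi\n' | weed_demo
```

With this input only the From and Subject lines should come out. The line
that follows the folded header (Path here) falls through to ":cont" and is
still checked against the remaining rules. Headers removed with a plain
"d" do not go through the loop, so if one of them happens to be folded,
its continuation lines stay in the output.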
