Re: NNTPC: minor patch & ideas about xover

Julian Assange wrote:
> 
> Herbert, you are worse than Jaws ;)

<evil grin>

> > 	Another minor annoyance with the current xover implementation
> > is that xover_input returns whenever it gets '.', and it won't create an
> > xover file in that case.  This can be problematic if people repeatedly
> > request a section of xover records that has been expired and there's
> > no xover file to tell nntpcached that it has gone.  One solution might
> > be to move the file creation mechanism to xover_io.
> 
> Hmm. How often in reality are there holes of over 512 expired articles?
> _in between_ valid articles? It seems awfully inefficient to have
> xover bitmaps created for such a situation.

These gaps are caused by articles that have very long expiry times due to
'Expires' headers.  Personally I don't read a lot of newsgroups, so I can't
comment on how prevalent they are.  However, if we do implement the list
of articles you suggest below, then this will be a non-issue.

Alternatively, we could rely on the assumption that all newsgroups will
be of the form:

	Some discontinuous articles with Expires headers, followed by
	a range (min--max) of articles.

If we assume this, then we only need to keep the map for the range.  However,
we'll need to update the bounds of the range regularly.
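
To make the assumption concrete, here is a rough sketch in C of what a
per-group summary could look like.  Everything in it is hypothetical: the
names (group_map, MAX_STRAGGLERS and so on) are made up for illustration and
are not taken from the nntpcached source, and the cap of 16 is just a guess.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_STRAGGLERS 16	/* hypothetical cap on long-lived articles */

/* Hypothetical per-group summary; not an nntpcached structure. */
struct group_map {
	unsigned long stragglers[MAX_STRAGGLERS]; /* kept alive by Expires */
	size_t nstragglers;
	unsigned long min;	/* low end of the contiguous range */
	unsigned long max;	/* high end of the contiguous range */
};

/* Returns true if article n is a known straggler or lies inside the range. */
static bool group_map_has(const struct group_map *g, unsigned long n)
{
	assert(g->nstragglers <= MAX_STRAGGLERS);
	if (n >= g->min && n <= g->max)
		return true;
	for (size_t i = 0; i < g->nstragglers; i++)
		if (g->stragglers[i] == n)
			return true;
	return false;
}

/* Replace the bounds of the range, e.g. after asking the server again. */
static void group_map_set_range(struct group_map *g,
				unsigned long min, unsigned long max)
{
	assert(min <= max);
	g->min = min;
	g->max = max;
}
```

A request for anything outside both the straggler list and the range could
then be answered without going to the remote server at all.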

PS Can we use 'xhdr Lines range' to save some bandwidth? Or is this not
worth it? Maybe we can write a small program that'll run on the remote
server to give the info to nntpcached?
 
> I too have thought about this one. Rather than any sort of pro-active
> measure my idea was to use the information returned by listgroup and the
> article commands to continually update the list of articles present on
> the remote server. This is a pain though, and will require a complex new
> data structure to handle it - a binary linked list of article ranges for
> each group. I guess we could chain these onto the current _struct
> newsgroup_, but memory coherency is going to be a big issue. Just using
> mmalloc is going to spread the information required for any one group
> all over the swap, more or less randomly. The best idea might be to have
> a persistent mmapped offset (rather than pointer, so it is re-locatable)
> based list in each group directory, although this is going to use at
> least PAGESIZE (typically 4k-16k) bytes per group with articles
> (including cross posted articles). Further, it means that each crosspost()
> is going to have to read in and possibly write out groups*PAGESIZE worth
> of lists.

If we make the assumption above, the list will be very small, considering that
there are usually <10 articles with Expires headers.
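
For what it's worth, the offset-based (re-locatable) list you describe could
be sketched roughly as below.  This is only an illustration under my own
assumptions: the names, the 4096-byte page and the node layout are all
invented here, and a plain buffer stands in for the mmapped file.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define LIST_PAGE 4096		/* assumed page size */

/* Header at the start of the page; all links are byte offsets from base. */
struct range_page {
	uint32_t head;		/* offset of first node, 0 = empty list */
	uint32_t used;		/* bytes used so far, header included */
};

struct range_node {
	uint32_t lo, hi;	/* inclusive article range */
	uint32_t next;		/* offset of next node, 0 = end of list */
};

static void range_page_init(void *base)
{
	struct range_page *p = base;

	memset(base, 0, LIST_PAGE);
	p->used = sizeof *p;	/* first node goes right after the header */
}

/* Turn an offset back into a pointer for wherever the page is mapped now. */
static struct range_node *node_at(void *base, uint32_t off)
{
	return off ? (struct range_node *)((char *)base + off) : NULL;
}

/* Push a range onto the front of the list; -1 if the page is full. */
static int range_page_push(void *base, uint32_t lo, uint32_t hi)
{
	struct range_page *p = base;
	struct range_node *n;
	uint32_t off = p->used;

	assert(lo <= hi);
	if (p->used + sizeof *n > LIST_PAGE)
		return -1;
	n = node_at(base, off);
	n->lo = lo;
	n->hi = hi;
	n->next = p->head;
	p->head = off;
	p->used += sizeof *n;
	return 0;
}

/* Returns 1 if art falls inside any stored range, 0 otherwise. */
static int range_page_contains(void *base, uint32_t art)
{
	struct range_page *p = base;
	struct range_node *n;

	for (n = node_at(base, p->head); n; n = node_at(base, n->next))
		if (art >= n->lo && art <= n->hi)
			return 1;
	return 0;
}
```

Because nothing in the page is an absolute pointer, the same bytes stay valid
if the file is mapped at a different address next time, or simply copied.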

-- 
Debian GNU/Linux 1.1 is out! { http://www.debian.org/ }
Email:  Herbert Xu 许志壬 <herbert@greathan.apana.org.au>
{ http://greathan.apana.org.au/~herbert/ }
PGP Key:  pgp-public-keys@pgp.mit.edu or any other key sites
