Re: NNTPC: performance enhancements and other questions

> 
> First, I just upgraded from 1.0.3 to 1.0.6.5.  I ended up having to delete
> the cache.mmap in order for the program to work.  Otherwise the GROUP
> command kept causing a SEGV in nntpcached.  Still testing to see how it
> works.  Is this normal/expected?  I didn't see mention of it in any
> documentation.  Btw, the system is a Linux system with a recent-vintage
> kernel and libraries.  I can boot in several configurations, and the
> problem shows up in them all.


No, it isn't. 1.0.6.5 hasn't been heavily tested - that is why it has a .5
tacked onto the end and there was no announcement. That said, it has only
one (small) new feature and a few minor bug fixes - although some of those
bug fixes may themselves have introduced bugs. I'd be interested to see
whether you find the same problems with 1.0.6.

> Calling setproctitle (something I abhor with a passion anyway), before
> getopt in main causes a segfault.  Simply moving it down after getopt fixes
> the problem (again, under Linux).

This makes sense, given the way setproctitle handles argv on Linux.

> Now, for the big thing.  Performance.
> 
> Take, for instance, the group misc.jobs.offered. 
> 
> On netcom, there are currently 45,000 posts in that group.  Now, I go and
> fire up trn to read that group, and even through nntpcache, well, let's say
> I've not yet been able to read the group.  
> 
> Now, nntpcached is pretty damned good for the smaller groups that I read
> (ie, less than 3000 articles per group).  But once you get around 45000
> articles, you start filling that directory up with a hell of a lot of files
> (*_head and *_xover).  And, well, most file systems in the unix world just
> aren't logarithmic when it comes to directory lookups.  As a result, it
> takes forever to test for each _head file, and to create it once you get it
> from the remote server.  _xover files are a bit better, due to the
> consolidation of 512 articles per file.  

What kind of insane news-reader insists on using HEAD when XOVER is
available?

> So, what I suggest taking a hint from INN and get rid of direct support for
> HEAD.  
>
> XOVER and HEAD have the same information.  Why duplicate it with both _head
> and _xover files?   

Because they don't have the same information. HEAD is very much a
superset of XOVER.
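
To make the superset point concrete, here is a small illustration (not
nntpcache code): an XOVER response line carries only the fields listed in
the standard overview format - Subject, From, Date, Message-ID, References,
byte and line counts, and optionally Xref - whereas HEAD returns every
header of the article. The sample line and values below are invented.

```python
# Field order follows the classic overview.fmt (RFC 2980 / NOV database).
OVERVIEW_FIELDS = ["Subject", "From", "Date", "Message-ID",
                   "References", ":bytes", ":lines", "Xref"]

def parse_xover_line(line):
    """Split one tab-separated XOVER response line into a dict."""
    artnum, *fields = line.split("\t")
    return int(artnum), dict(zip(OVERVIEW_FIELDS, fields))

# HEAD, by contrast, returns *all* headers -- Path, Newsgroups,
# Organization, NNTP-Posting-Host, and so on -- none of which appear
# in the overview.  Hence HEAD is a strict superset of XOVER.
num, over = parse_xover_line(
    "12345\tTest post\tuser@example.com\t1 Jan 1997 00:00:00 GMT"
    "\t<abc@example.com>\t\t1024\t20\tXref: news misc.test:12345")
```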

> Instead, use only _xover files, and whenever a HEAD request is made,
> regenerate the header from the xover information.  This is what INN does.

<shrug> It's completely broken behavior. INN shouldn't do it.

> Also, always provide xover information, even if the remote server doesn't
> provide it.   Take header information from the remote server and generate
> xover information from it.  

Yes, this could be done. But I don't see a call for the added complexity.
No server worth its salt doesn't support XOVER, and no client worth its
salt doesn't support fall-back to HEAD.
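
For what it's worth, were that feature ever added, synthesizing an overview
line from a HEAD response is straightforward. A minimal sketch (assumed
helper names, not nntpcache internals; Bytes/Lines are taken from headers
when present and left empty otherwise):

```python
def headers_to_dict(head_lines):
    """Fold a HEAD response (list of 'Name: value' lines) into a dict,
    joining RFC 1036-style continuation lines onto the previous header."""
    headers = {}
    last = None
    for line in head_lines:
        if line[:1] in (" ", "\t") and last:      # continuation line
            headers[last] += " " + line.strip()
        elif ":" in line:
            name, _, value = line.partition(":")
            last = name.strip()
            headers[last] = value.strip()
    return headers

def synthesize_xover(artnum, head_lines):
    """Build one tab-separated overview line from full headers."""
    h = headers_to_dict(head_lines)
    fields = [h.get("Subject", ""), h.get("From", ""), h.get("Date", ""),
              h.get("Message-ID", ""), h.get("References", ""),
              h.get("Bytes", ""), h.get("Lines", "")]
    return str(artnum) + "\t" + "\t".join(fields)
```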

> Also, it seems to me that the XHDR commands could be handled by using the
> XOVER database rather than hitting the server.  Assuming, of course that
> the xover information that the server provides is sufficient to meet the
> request of the xhdr command.  The valid lists of required xover fields and
> those of rfc1036 should be identical.

XHDRs are already handled this way. Where the requested header is not in
the overview.fmt description for the specified server, nntpcache falls
back to passing XHDR through to the remote server.
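
The dispatch just described can be sketched like this (assumed names, not
nntpcache's actual data structures): answer from the cached overview when
the header appears in the server's overview.fmt, otherwise signal the
caller to fall back to the remote server.

```python
# Lower-cased header names in the order the server's overview.fmt lists them.
OVERVIEW_FMT = ["subject", "from", "date", "message-id",
                "references", "bytes", "lines"]

def xhdr_from_overview(header, overview_rows):
    """overview_rows: {artnum: [field, ...]} in overview.fmt order.
    Returns {artnum: value}, or None when the header isn't in the
    overview (caller then passes XHDR through to the remote server)."""
    name = header.lower()
    if name not in OVERVIEW_FMT:
        return None
    idx = OVERVIEW_FMT.index(name)
    return {n: row[idx] for n, row in overview_rows.items()}
```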

Cheers,
Julian.

--
Prof. Julian Assange  |If you want to build a ship, don't drum up people
		      |together to collect wood and don't assign them tasks
proff@iq.org          |and work, but rather teach them to long for the endless
proff@gnu.ai.mit.edu  |immensity of the sea. -- Antoine de Saint Exupery

