>> From: Jari Arkko <jari.arkko@xxxxxxxxx>
>> Subject: Re: query about ID/RFC statistics
>> Date: 27 May 2015 01:44:44 GMT+3
>> To: Christopher Morrow <morrowc.lists@xxxxxxxxx>
>> Cc: Michael Richardson <mcr+ietf@xxxxxxxxxxxx>, ietf <ietf@xxxxxxxx>
>>
>>
>>> aren't these things listed in the XML and this a 'quick' xml xpath
>>> parse away from win?
>>
>> Yes, they are in some cases… the difficulty with getauthors has been the
>> special cases. The non-XML… the non-numbered sections… the oddly
>> indented stuff… the misspelled stuff… people’s names in different
>> formats… the people who misspell their names :-) or at least their
>> co-authors’ names :-)
>>
>> You could of course say that we should ignore all that broken stuff.
>> I wanted to have a smaller error rate, hence included many special
>> cases. Almost all of this is data-driven, so once you add a pattern
>> line the tool will recognise it in the future.
>>
>> Anyway, I do have a set of tools that I really have no time to maintain.
>> One group of tools is getauthors/authorstats, the one that collects
>> document statistics. It is operational, but if taken over by anyone
>> else it needs a rewrite. Despite being somewhat data-driven, the
>> rest of the code is a hack upon a hack.
>>
>> Another group of tools is the IESG statistics tools, which would
>> be very interesting, but are no longer operational due to interface
>> changes to how the data tracker presents itself. It too would need
>> a lot of work.
>>
>> If anyone is interested in putting time into these tools, let us know!
>> For instance, we could start with a Code Sprint project in Prague.
>> The IETF also runs official tools projects that are funded with
>> IETF funds. So far I have not considered these tools so business
>> critical that we’d need to have them done commercially. And some
>> of them have been up and running through community effort,
>> i.e., me, Lars, and a few people who have sent me edits.
>> Let me know if these are so critical that they’d need more
>> official IETF attention.

Is the source code for your pages already publicly available?

Russ