Re: Man page website

Pete Travis <me@xxxxxxxxxxxxxx> · Tue, 31 Jul 2012 08:17:45 -0600

On Jul 31, 2012 2:09 AM, "Adam Pribyl" <pribyl@lowlevel.cz> wrote:

>

> On Mon, 30 Jul 2012, Pete Travis wrote:

>

>> On Mon, Jul 30, 2012 at 10:39 PM, Ankur Sinha <sanjay.ankur@gmail.com> wrote:

>>

>>> On Tue, 2012-07-31 at 12:18 +0800, Christopher Meng wrote:

>>>>

>>>> I think we can get as many as possible.

>>>>

>>>> But it's really not an easy job.

>>>

>>>

>>> I've just been looking at the man2html manual. Apparently, you don't

>>> need to convert the man pages to html. The manual says:

>>>

>>>

>>>> This can be used as a stand-alone utility, but is mainly intended as an

>>>> auxiliary, to enable users to browse their man pages using a html browser

>>>> like lynx(1), xmosaic(1) or netscape(1).

>>>

>>> The manual then mentions quite a lot of stuff about CGI (something I know

>>> ZERO about). I suggest we involve infra in the discussion? I know

>>> bugzilla uses CGI, and infra would probably know how to use man2html?

>>>

>>>

>>> One requirement seems to be that we have *all* man pages from *all* our

>>> packages installed on the host so that man2html can run on it. I'm not

>>> sure if we have a host that has all packages installed. Again, infra

>>> would probably be able to shed more light on these details.

>>>

>>> And, uh, I would like to help out on this ;)

>>

>>

>>

>> I had a brief conversation with the guys in #fedora-admin about the idea

>> this afternoon. It doesn't sound like there is a single resource that

>> currently has every .rpm or src.rpm on it for us to play with for a new

>> app, and they understandably weren't enthusiastic about enabling that kind

>> of kludge. For every package, we'd have to find out if there was an update,

>> pull the updated rpm/src.rpm, extract the manpages, parse over them, and

>> push to the web app.  They advised that the packages app looks like the

>> best bet for accomplishing this in a sustainable manner, since it's digging

>> through each rpm anyway.  That doesn't gain us much besides affirmation,

>> but it does give us somewhere to focus our efforts.

>

>

> It's definitely not necessary to install every RPM - you can find the man pages in an RPM without installing it, by just unpacking the RPMs one by one in e.g. tmp. A script that goes through all RPMs in the "Everything" repository of the latest Fedora release once a month would do the trick. There will still be work to do on the "thing" that runs man2html, as I'd bet there will be errors in the HTML man pages caused by this conversion.

>

> Adam Pribyl

> --

> docs mailing list

> docs@lists.fedoraproject.org

> To unsubscribe:

> https://admin.fedoraproject.org/mailman/listinfo/docs

Adam, 
I think you misunderstood my comments. It isn't possible to actually install *every* RPM, and I didn't imply that we should try.  While the idea of grabbing the relevant files and unpacking them in /tmp is simple, it doesn't scale well.  Your monthly cron job might take all month to run!  Downloading tens of thousands of packages and shuffling their contents through scripts is the "kludge" I was referring to.
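
To give a sense of what "grabbing the relevant files" means for a single package, here is a rough Python sketch that picks the man pages out of a package's file list. It assumes man pages live under /usr/share/man with an optional locale directory and an optional compression suffix; the names MAN_PATH, ManPage and find_man_pages are made up for illustration and are not part of the packages app or any existing tool.

```python
import re
from typing import NamedTuple, Optional

# Assumed layout: /usr/share/man/[<locale>/]man<dir>/<name>.<section>[.gz|.bz2|.xz]
MAN_PATH = re.compile(
    r"^/usr/share/man/(?:(?P<locale>[^/]+)/)?man[0-9a-z]+/"
    r"(?P<name>.+)\.(?P<section>[0-9][0-9a-zA-Z]*)(?:\.(?:gz|bz2|xz))?$"
)


class ManPage(NamedTuple):
    name: str               # e.g. "ls"
    section: str            # e.g. "1" or "3pm"
    locale: Optional[str]   # e.g. "de", or None for the default pages


def find_man_pages(paths):
    """Return a ManPage for every path that looks like a man page (hypothetical helper)."""
    pages = []
    for path in paths:
        match = MAN_PATH.match(path)
        if match:
            pages.append(
                ManPage(match.group("name"), match.group("section"), match.group("locale"))
            )
    return pages


if __name__ == "__main__":
    sample = [
        "/usr/bin/ls",
        "/usr/share/man/man1/ls.1.gz",
        "/usr/share/man/de/man8/mount.8",
    ]
    for page in find_man_pages(sample):
        print(page)
```

The filtering itself is cheap. The expensive part is getting at tens of thousands of file lists and payloads in the first place, which is why doing it where the rpms are already being opened looks attractive.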

I'm not qualified to dig around in the packages app, but I'm glad to see there might be some interest.  The formatting surely needs some attention. I just want to see the code that does the formatting paired with the code that does the extracting, in a way that doesn't consume a large amount of resources and can be kept up to date as updates are pushed, without human intervention.
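
As a sketch of the "kept updated without human intervention" part: if we remember which version-release of each package was processed last time, each run only has to touch what changed. This is only an illustration in Python. The function name packages_to_refresh and the plain name -> version-release dicts are my own assumptions, and it uses simple string inequality rather than real RPM version comparison.

```python
def packages_to_refresh(previous, current):
    """Compare two snapshots mapping package name -> version-release string.

    Returns (changed, removed): packages that are new or whose version-release
    differs from the last run, and packages that have disappeared.
    Hypothetical helper; any difference in the string counts as a change.
    """
    changed = sorted(
        name for name, vr in current.items() if previous.get(name) != vr
    )
    removed = sorted(name for name in previous if name not in current)
    return changed, removed


if __name__ == "__main__":
    last_run = {"coreutils": "8.15-6", "man2html": "1.6-1", "oldpkg": "0.1-1"}
    this_run = {"coreutils": "8.15-7", "man2html": "1.6-1", "newpkg": "2.0-1"}
    changed, removed = packages_to_refresh(last_run, this_run)
    print("re-extract man pages for:", changed)
    print("drop man pages for:", removed)
```

With something like this, only the packages in the first list need their man pages re-extracted and re-formatted on a given run, instead of walking the whole repository every time.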

For reference, Debian's method is described here: http://lists.debian.org/debian-services-admin/2011/08/msg00003.html

I think we can come up with something more elegant and efficient. 

--pete