Re: [PATCH] Documentation: update top level 00-INDEX file with new additions

Rob Landley <rob@xxxxxxxxxxx> · Wed, 06 Feb 2013 23:28:44 -0600

On 01/29/2013 09:34:00 AM, Paul Gortmaker wrote:
It seems there are about 80 new, but undocumented additions at
the top level Documentation directory.  This fixes up the top
level 00-INDEX by adding new entries and deleting a couple orphans.
Some subdirs could probably still use a check/cleanup too though.

Cc: Rob Landley <rob@xxxxxxxxxxx>
Signed-off-by: Paul Gortmaker <paul.gortmaker@xxxxxxxxxxxxx>

I've got a script that makes html navigation pages from the 00-INDEX  
files and another one that parses that to find dead links in both  
directions. (Files with no 00-INDEX entry and 00-INDEX entries that  
don't refer to a file.)
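
A minimal sketch of that "both directions" check, assuming the file paths and the index entries have already been collected into two lists. The name check_links and the sample paths are made up for illustration and are not part of the attached scripts, which do the same comparison with filter() over lists:

```python
def check_links(files, links):
    """Return (dead_links, unlinked_files) for two collections of paths."""
    files, links = set(files), set(links)
    dead = sorted(links - files)      # index entries with no file behind them
    unlinked = sorted(files - links)  # files that no index entry mentions
    return dead, unlinked

# Hypothetical sample data, not taken from the real tree.
files = ["Documentation/a.txt", "Documentation/b.txt"]
links = ["Documentation/a.txt", "Documentation/gone.txt"]

dead, unlinked = check_links(files, links)
print("404 errors:", dead)
print("Unlinked documents:", unlinked)
```

Using sets keeps each lookup cheap, which may matter once the tree holds a few thousand files.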

I haven't run it in forever because the kernel.org guys took  
everybody's accounts away, and they won't give me a new .ssh key  
without a blood test or some such, and even if I did jump through the  
hoops they made ssh go to a git wrapper you can't rsync through, so I  
can't update kernel.org/doc/Documentation anymore. (Files attached  
anyway.)

The patch looks good, but it also highlights the fact that this  
directory needs a wholesale cleanup. Translations into languages the  
developers don't speak and can't audit really don't belong in this  
directory (they belong on the web somewhere), but Greg KH says  
otherwise. The architecture stuff needs to be collated under an "arch"  
directory the same way the source is. Zorro is still a serial driver at  
the top level...

Sigh. I have buckets of things I want to do to this directory but no  
longer have a kernel account. *shrug*

Acked-by: Rob Landley <rob@xxxxxxxxxxx>

Can you send it through the trivial tree?

Rob
#!/usr/bin/python

import os

# Get a list of files under the Documentation directory,
# filtering out instances of index.html

dirlist = []
for i in os.walk("Documentation"):
  for j in i[1]: dirlist.append("%s/%s/" % (i[0], j))
  for j in i[2]:
    if j!="index.html": dirlist.append("%s/%s" % (i[0], j))
dirlist.sort()

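# Function to parse a relative link and append it to a list.
# "tag" is the text between "<" and ">"; "data" (the text after it) is unused.
taglist = []
def handletag(path, tag, data):
  tag = tag.split()
  # An empty tag such as "<>" splits to an empty list, so check before indexing.
  if tag and tag[0]=="a":
    for i in tag:
      if i.startswith("href="):
        i = i[5:]
        # Strip the surrounding double quotes, if present.
        if len(i)>=2 and i[0]=='"' and i[-1]=='"': i=i[1:-1]
        taglist.append("%s/%s" % (path, i))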

# Find all the index.html files under Documentation, read each one,
# iterate through the html tags and call handletag() for each.

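for dir in os.walk("Documentation"):
  if "index.html" in dir[2]:
    data = open("%s/index.html" % dir[0]).read()
    # Each chunk after a "<" looks like "tag>text".
    data = data.split("<")[1:]
    for i in data:
      i = i.split(">", 1)
      # Skip a stray "<" in the text that has no closing ">".
      if len(i)==2: handletag(dir[0], i[0], i[1])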

# Display the links with no files, and the files nothing linked to.
print("404 errors:")
for i in filter(lambda a: a not in dirlist, taglist): print(i)
print("Unlinked documents:")
for i in filter(lambda a: a not in taglist, dirlist): print(i)
#!/usr/bin/python

# Convert kernel Documentation/.../00-INDEX to index.html

import os

for dir in os.walk("Documentation"):
  if "00-INDEX" not in dir[2]: continue

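  # Read input

  lines = open("%s/00-INDEX" % dir[0]).read()

  # Split off any preamble before the first "00-INDEX" entry. A file that
  # never mentions itself is not in the expected format, so report and skip it.
  lines = lines.split("00-INDEX",1)
  if len(lines)==1:
    print("FAILED %s" % dir[0])
    continue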

  # Open output, write header and <pre> section (if any), then start the list
  out = open("%s/index.html" % dir[0], "w")
  out.write("<html>\n<title>%s</title>\n<body>\n" % dir[0])
  if lines[0]: out.write("<pre>%s</pre>\n" % lines[0])
  out.write("<ul>\n")
  lines = lines[1].split("\n")
  lines[0] = "00-INDEX"

  # Non-indented lines name a file and start a new <li> with a link;
  # indented lines are description text appended to the current item.
  close = 0
  for line in lines:
    if not line: continue
    if not line[0].isspace():
      if close: out.write('</li>\n')
      out.write('<li><a href="%s">%s</a>' % (line.strip(), line.strip()))
      close = 1
    else: out.write(" %s" % line.strip())
  out.write("</li>\n</ul>\n</body>\n</html>\n")
  out.close()