On 01/29/2013 09:34:00 AM, Paul Gortmaker wrote:
> It seems there are about 80 new, but undocumented additions at
> the top level Documentation directory. This fixes up the top
> level 00-INDEX by adding new entries and deleting a couple orphans.
> Some subdirs could probably still use a check/cleanup too though.
>
> Cc: Rob Landley <rob@xxxxxxxxxxx>
> Signed-off-by: Paul Gortmaker <paul.gortmaker@xxxxxxxxxxxxx>
I've got a script that makes html navigation pages from the 00-INDEX
files and another one that parses that to find dead links in both
directions. (Files with no 00-INDEX entry and 00-INDEX entries that
don't refer to a file.)
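
The two-direction check boils down to two set differences. A minimal sketch, assuming the index entries and the on-disk paths have already been collected into plain lists of relative paths; the name `find_dead_links` and the sample paths are made up for illustration and are not taken from the attached files:

```python
def find_dead_links(indexed, on_disk):
    """Compare index entries against files on disk, in both directions.

    Returns (missing, unlinked): entries that name no existing file,
    and files that no entry mentions. Both lists are sorted.
    """
    indexed = set(indexed)
    on_disk = set(on_disk)
    missing = sorted(indexed - on_disk)   # index entry, but no such file
    unlinked = sorted(on_disk - indexed)  # file, but no index entry
    return missing, unlinked


# Hypothetical sample data, for illustration only.
missing, unlinked = find_dead_links(
    ["Documentation/CodingStyle", "Documentation/gone.txt"],
    ["Documentation/CodingStyle", "Documentation/new.txt"],
)
print("404 errors:", missing)
print("Unlinked documents:", unlinked)
```

Sets make each membership test roughly constant time, which may matter once the directory holds a few thousand files.
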
I haven't run it in forever because the kernel.org guys took
everybody's accounts away, and they won't give me a new .ssh key
without a blood test or some such, and even if I did jump through the
hoops they made ssh go to a git wrapper you can't rsync through, so I
can't update kernel.org/doc/Documentation anymore. (Files attached
anyway.)
The patch looks good, but it also highlights the fact that this
directory needs a wholesale cleanup. Translations into languages the
developers don't speak and can't audit really don't belong in this
directory (they belong on the web somewhere), but Greg KH says
otherwise. The architecture stuff needs to be collated under an "arch"
directory the same way the source is. Zorro is still a serial driver at
the top level...
Sigh. I have buckets of things I want to do to this directory but no
longer have a kernel account. *shrug*
Acked-by: Rob Landley <rob@xxxxxxxxxxx>
Can you send it through the trivial tree?
Rob
#!/usr/bin/python

import os,sys

# Get a list of files under the Documentation directory,
# filtering out instances of index.html

dirlist = []
for i in os.walk("Documentation"):
    for j in i[1]: dirlist.append("%s/%s/" % (i[0], j))
    for j in i[2]:
        if j!="index.html": dirlist.append("%s/%s" % (i[0], j))
dirlist.sort()

# Function to parse a relative link and append it to a list.

taglist = []
def handletag(path, tag, data):
    tag = tag.split()
    if tag[0]=="a":
        for i in tag:
            if i.startswith("href="):
                i = i[5:]
                if i[0]=='"' and i[-1]=='"': i=i[1:-1]
                taglist.append("%s/%s" % (path, i))

# Find all the index.html files under Documentation, read each one,
# iterate through the html tags and call handletag() for each.

for dir in os.walk("Documentation"):
    if "index.html" in dir[2]:
        data = open("%s/index.html" % dir[0]).read()
        data = data.split("<")[1:]
        for i in data:
            i = i.split(">")
            handletag(dir[0], i[0], i[1])

# Display the links with no files, and the files nothing linked to.

print("404 errors:")
for i in filter(lambda a: a not in dirlist, taglist): print(i)
print("Unlinked documents:")
for i in filter(lambda a: a not in taglist, dirlist): print(i)

#!/usr/bin/python

# Convert kernel Documentation/.../00-INDEX to index.html

import os,sys

for dir in os.walk("Documentation"):
    if not "00-INDEX" in dir[2]: continue

    # Read input

    lines = open("%s/00-INDEX" % dir[0]).read()
    lines = lines.split("00-INDEX",1)
    if len(lines)==1:
        print("FAILED %s" % dir[0])
        continue

    # Open output, write header and <pre> section (if any)

    out = open("%s/index.html" % dir[0], "w")
    out.write("<html>\n<title>%s</title>\n<body>\n<ul>\n" % dir[0])
    if lines[0]: out.write("<pre>%s</pre>\n" % lines[0])
    lines = lines[1].split("\n")
    lines[0] = "00-INDEX"

    # Each line starting in column 0 opens a new list entry;
    # indented lines are appended to the current entry's description.

    close = 0
    for idx in range(len(lines)):
        if not lines[idx]: continue
        if not lines[idx][0].isspace():
            if close: out.write('</li>\n')
            out.write('<li><a href="%s">%s</a>' % (lines[idx].strip(), lines[idx].strip()))
            close = 1
        else: out.write(" %s" % lines[idx].strip())
    out.write("</li>\n</ul>\n</body>\n</html>\n")
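
For reference, the 00-INDEX layout the converter above relies on is an entry name starting in the first column, followed by one or more indented description lines. A minimal sketch of that parsing step on its own, assuming Python 3; `parse_index` and the sample text are hypothetical, and unlike the script it makes no attempt to handle free-form text ahead of the first `00-INDEX` entry:

```python
def parse_index(text):
    """Split 00-INDEX style text into (name, description) pairs.

    A line starting in column 0 opens a new entry; indented lines
    are joined onto the description of the current entry.
    """
    entries = []
    for line in text.split("\n"):
        if not line.strip():
            continue                      # skip blank lines
        if not line[0].isspace():
            entries.append([line.strip(), ""])
        elif entries:
            name, desc = entries[-1]
            desc = (desc + " " + line.strip()).strip()
            entries[-1] = [name, desc]
    return [tuple(e) for e in entries]


# Hypothetical sample, for illustration only.
sample = "00-INDEX\n\t- this file.\nfoo.txt\n\t- notes on foo,\n\t  continued.\n"
for name, desc in parse_index(sample):
    print("%s: %s" % (name, desc))
```

Returning pairs rather than writing HTML directly would let the same parse feed both the page generator and the dead-link check.
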