Re: [PATCH 3/4] scripts/kernel-doc: Track line-number of starting blocks

Daniel Vetter <daniel.vetter@xxxxxxxx> · Thu, 2 Jun 2016 11:42:06 +0200

On Thu, Jun 2, 2016 at 9:51 AM, Jani Nikula <jani.nikula@xxxxxxxxxxxxxxx> wrote:
> On Thu, 02 Jun 2016, Daniel Vetter <daniel.vetter@xxxxxxxx> wrote:
>> Design is pretty simple: kernel-doc inserts breadcrumbs with line
>> numbers, and sphinx picks them up. At first I went with a sphinx
>> comment, but inserting those at random places seriously upsets the
>> parser, and must be filtered. Hence why this version now uses "#define
>> LINEO " since one of these ever escape into output it's pretty clear
>> there is a bug.
>>
>> It seems to work well, and at least the 2-3 errors where sphinx
>> complained about something that was not correct in kernel-doc text the
>> line numbers matched up perfectly.
>
> Very nice indeed!
>
>> v2: Instead of noodling around in the parser state machine, create
>> a ViewList and parse it ourselves. This seems to be the recommended
>> way, per Jani's suggestion. Unfortunately it does not fix the WARNING
>> gallore this patch somehow produces.
>
> What WARNINGs do you see? I don't think I'm seeing anything new from
> this patch.
>
> There are occasional problems with document re-generation when you've
> changed either kernel-doc and/or kernel-doc.py. Sphinx doesn't know the
> output depends on one or the other. For now I've just run 'make
> cleandocs' every time I've changed the scripts or the Makefile;
> otherwise the dependencies on source files and everything seems to be
> working nicely.

It was late and I was tired. That last sentence isn't valid any more,
I think I fixed all the issues by removing the LINENO lines before
inserting into the sphinx tree.

>> Cc: Jani Nikula <jani.nikula@xxxxxxxxx>
>> Cc: linux-doc@xxxxxxxxxxxxxxx
>> Cc: Jonathan Corbet <corbet@xxxxxxx>
>> Signed-off-by: Daniel Vetter <daniel.vetter@xxxxxxxx>
>> ---
>>  Documentation/sphinx/kernel-doc.py | 24 ++++++++++++++++++++++--
>>  scripts/kernel-doc                 | 18 ++++++++++++++++++
>>  2 files changed, 40 insertions(+), 2 deletions(-)
>>
>> diff --git a/Documentation/sphinx/kernel-doc.py b/Documentation/sphinx/kernel-doc.py
>> index 9fc2c27916a9..1ae4d927fa9d 100644
>> --- a/Documentation/sphinx/kernel-doc.py
>> +++ b/Documentation/sphinx/kernel-doc.py
>> @@ -30,10 +30,13 @@
>>  import os
>>  import subprocess
>>  import sys
>> +import re
>>
>>  from docutils import nodes, statemachine
>> +from docutils.statemachine import ViewList
>>  from docutils.parsers.rst import directives
>>  from sphinx.util.compat import Directive
>> +from sphinx.util.nodes import nested_parse_with_titles
>
> We already import nodes, so I think using nodes.nested_parse_with_titles
> is nicer.
>
>>
>>  class KernelDocDirective(Directive):
>>      """Extract kernel-doc comments from the specified file"""
>> @@ -92,8 +95,25 @@ class KernelDocDirective(Directive):
>>                  sys.stderr.write(err)
>>
>>              lines = statemachine.string2lines(out, tab_width, convert_whitespace=True)
>> -            self.state_machine.insert_input(lines, source)
>> -            return []
>> +            result = ViewList()
>> +
>> +            lineoffset = 0;
>> +            for line in lines:
>> +                match = re.search("#define LINENO ([0-9]*)", line)
>
> I wonder if it would be faster to check for line[0] == '#' or
> line.startswith("#define LINENO ") first before going for regexp
> matching on every line.
>
> Anyway you should match [0-9]+ to require at least one digit, and
> surround the regexp with ^ and $.
>
>> +                if match:
>> +                    # sphinx counts lines from 0
>> +                    lineoffset = int(match.group(1)) - 1
>> +                    # we must eat our comments since the upset the markup
>> +                else:
>> +                    result.append(line, source, lineoffset)
>> +                    lineoffset += 1
>> +
>> +            node = nodes.section()
>> +            node.document = self.state.document
>> +            nested_parse_with_titles(self.state, result, node)
>
> I think we might like to use self.state.nested_parse(result, 0, node)
> instead of nested_parse_with_titles(self.state, result, node). We should
> probably prohibit use of titles in the kernel-doc comments, as those
> will mess up the document structure in the document including the
> kernel-doc.

I tried that, but at least yesterday my python fu was too shallow to
be able to pull that off. If you have a fixup diff for this one and
the import one, would be awesome.

I'll do all the other changes you recommended meanwhile.
-Daniel

>
>> +
>> +            return node.children
>> +
>
> I'd like the switch from using self.state_machine.insert_input() to
> creating ViewList() and parsing that to be a separate prep change. Small
> steps so we can bisect easier.
>
> The kernel-doc changes seem sensible, but I'd like to have the line
> number printouts be behind an option; it's no longer reStructuredText
> with that. We could split this patch to:
>
> 1. switch from insert_input() to ViewList and return nodes
> 2. add lineno output support to kernel-doc, behind an option
> 3. pass the option to kernel-doc from kernel-doc.py, and start handling
>    the line numbers in kernel-doc output
>
>
> BR,
> Jani.
>
>
>>          except Exception as e:
>>              env.app.warn('kernel-doc \'%s\' processing failed with: %s' %
>>                           (" ".join(cmd), str(e)))
>> diff --git a/scripts/kernel-doc b/scripts/kernel-doc
>> index 4da6f952d18b..b29c2a576a8c 100755
>> --- a/scripts/kernel-doc
>> +++ b/scripts/kernel-doc
>> @@ -411,13 +411,16 @@ my $doc_inline_end = '^\s*\*/\s*$';
>>  my $export_symbol = '^\s*EXPORT_SYMBOL(_GPL)?\s*\(\s*(\w+)\s*\)\s*;';
>>
>>  my %parameterdescs;
>> +my %parameterdesc_start_lines;
>>  my @parameterlist;
>>  my %sections;
>>  my @sectionlist;
>> +my %section_start_lines;
>>  my $sectcheck;
>>  my $struct_actual;
>>
>>  my $contents = "";
>> +my $new_start_line = 0;
>>
>>  # the canonical section names. see also $doc_sect above.
>>  my $section_default = "Description"; # default section
>> @@ -516,11 +519,15 @@ sub dump_section {
>>       $name = $1;
>>       $parameterdescs{$name} = $contents;
>>       $sectcheck = $sectcheck . $name . " ";
>> +        $parameterdesc_start_lines{$name} = $new_start_line;
>> +        $new_start_line = 0;
>>      } elsif ($name eq "@\.\.\.") {
>>  #    print STDERR "parameter def '...' = '$contents'\n";
>>       $name = "...";
>>       $parameterdescs{$name} = $contents;
>>       $sectcheck = $sectcheck . $name . " ";
>> +        $parameterdesc_start_lines{$name} = $new_start_line;
>> +        $new_start_line = 0;
>>      } else {
>>  #    print STDERR "other section '$name' = '$contents'\n";
>>       if (defined($sections{$name}) && ($sections{$name} ne "")) {
>> @@ -530,6 +537,8 @@ sub dump_section {
>>       } else {
>>           $sections{$name} = $contents;
>>           push @sectionlist, $name;
>> +            $section_start_lines{$name} = $new_start_line;
>> +            $new_start_line = 0;
>>       }
>>      }
>>  }
>> @@ -1775,6 +1784,7 @@ sub output_blockhead_rst(%) {
>>       if ($output_selection != OUTPUT_INCLUDE) {
>>           print "**$section**\n\n";
>>       }
>> +        print "#define LINENO " . $section_start_lines{$section} . "\n";
>>       output_highlight_rst($args{'sections'}{$section});
>>       print "\n";
>>      }
>> @@ -1861,6 +1871,7 @@ sub output_section_rst(%) {
>>
>>      foreach $section (@{$args{'sectionlist'}}) {
>>       print "**$section**\n\n";
>> +        print "#define LINENO " . $section_start_lines{$section} . "\n";
>>       output_highlight_rst($args{'sections'}{$section});
>>       print "\n";
>>      }
>> @@ -1958,6 +1969,7 @@ sub output_struct_rst(%) {
>>
>>       ($args{'parameterdescs'}{$parameter_name} ne $undescribed) || next;
>>       $type = $args{'parametertypes'}{$parameter};
>> +        print "#define LINENO " . ($parameterdesc_start_lines{$parameter_name} - 1). "\n";
>>       print "``$type $parameter``\n";
>>       output_highlight_rst($args{'parameterdescs'}{$parameter_name});
>>       print "\n";
>> @@ -2745,11 +2757,14 @@ sub process_file($) {
>>           if (/$doc_start/o) {
>>               $state = STATE_NAME;    # next line is always the function name
>>               $in_doc_sect = 0;
>> +                print "#define LINENO " . $. . "\n";
>>           }
>>       } elsif ($state == STATE_NAME) {# this line is the function name (always)
>>           if (/$doc_block/o) {
>>               $state = STATE_DOCBLOCK;
>>               $contents = "";
>> +                $new_start_line = $. + 1;
>> +
>>               if ( $1 eq "" ) {
>>                       $section = $section_intro;
>>               } else {
>> @@ -2833,6 +2848,7 @@ sub process_file($) {
>>               $in_doc_sect = 1;
>>               $in_purpose = 0;
>>               $contents = $newcontents;
>> +                $new_start_line = $.;
>>               while ((substr($contents, 0, 1) eq " ") ||
>>                      substr($contents, 0, 1) eq "\t") {
>>                   $contents = substr($contents, 1);
>> @@ -2866,6 +2882,7 @@ sub process_file($) {
>>                       dump_section($file, $section, xml_escape($contents));
>>                       $section = $section_default;
>>                       $contents = "";
>> +                        $new_start_line = $.;
>>                   } else {
>>                       $contents .= "\n";
>>                   }
>> @@ -2900,6 +2917,7 @@ sub process_file($) {
>>           if ($inline_doc_state == STATE_INLINE_NAME && /$doc_inline_sect/o) {
>>               $section = $1;
>>               $contents = $2;
>> +                $new_start_line = $. + 1;
>>               if ($contents ne "") {
>>                   while ((substr($contents, 0, 1) eq " ") ||
>>                          substr($contents, 0, 1) eq "\t") {
>
> --
> Jani Nikula, Intel Open Source Technology Center

-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch
--
To unsubscribe from this list: send the line "unsubscribe linux-doc" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html