On Fri, 22 Oct 2010 07:02, Thomas Rast wrote: > Add an autogeneration script that inserts links to other manpages > describing config variables, as follows: > > * parse config-vars-src.txt (as it now needs to be renamed) to find > out its current contents > > * parse each manpage source (following includes) for config variable > headers (the blocks are ignored) > > * assemble a new config-vars.txt that completes the original list with > "See linkgit:git-foo[1]" entries for all variables that were not in > it. So is this script about automatically managing links from git-config manpage to manpages of individual commands? Does it mean that for something like the following in Documentation/config-vars-src.txt foo.bar:: Lorem ipsum dolor sit amet, consectetur adipisicing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. and if `foo.bar` is referenced in Documentation/git-foo.txt in some way (from reading source of mentioned autogeneration script it considers only 'foo.bar::', like in Documentation/git-send-email.txt), then it adds See linkgit:git-foo[1] at the end of description of variable in Documentation/config-vars-src.txt (taking into account continuation blocks)? What about config variables mentioned in different ways, e.g. '`foo.bar`', like in git-update-index documentation? Does it checks that 'See linkgit:git-foo[1]' already exists, e.g. in extended form 'See linkgit:git-foo[1]. True by default.'? Or is it about automatically creating and updating blocks like this: sendemail.aliasesfile:: sendemail.aliasfiletype:: sendemail.bcc:: sendemail.cc:: [...] sendemail.thread:: sendemail.validate:: See linkgit:git-send-email[1] for description. See also comments below, where I realized what this scipt does... This really should be in commit message. > Signed-off-by: Thomas Rast <trast@xxxxxxxxxxxxxxx> > --- > Documentation/Makefile | 9 +- > Documentation/config-vars-src.txt | 1747 +++++++++++++++++++++++++++++++++++ > Documentation/config-vars.txt | 1747 ----------------------------------- > Documentation/make-config-list.perl | 131 +++ > 4 files changed, 1886 insertions(+), 1748 deletions(-) > create mode 100644 Documentation/config-vars-src.txt > delete mode 100644 Documentation/config-vars.txt > create mode 100755 Documentation/make-config-list.perl [...] > diff --git a/Documentation/make-config-list.perl b/Documentation/make-config-list.perl > new file mode 100755 > index 0000000..f086867 > --- /dev/null > +++ b/Documentation/make-config-list.perl > @@ -0,0 +1,131 @@ > +#!/usr/bin/perl While other helper scripts do not include comments describing them, it would be nice to have here description what script does and for what it is used. Comments in code would also be nice. The code lacks consistency: > + open my $fh, "<", $file or die "cannot open $file: $!"; vs > + my $fp; > + open $fp, '<', $name or die "open $name failed: $!"; > + close $fh or die "eh? close failed: $!"; vs > + close $fp or die "close $name failed: $!"; > + > +use strict; > +use warnings; > +use Getopt::Long; > + > +my %ignore; > + > +my $rc = GetOptions( > + "mainlist=s" => \my $mainlistfile, > + "ignore=s" => sub { $ignore{$_[1]} = 1 }, > + "no-sort" => \my $no_sort, > + ); > + > +if (!$rc or (!-r $mainlistfile)) { > + print "$0 --mainlist=<mainlist> [--ignore=<ignore>...] <asciidoc_manpage>...\n"; > + exit 1; > +} The names <mainlist> (which is file with list of configuration variables to modify), <ignore> (which is about ignoring some asciidoc_manpage files, but only when recursing / following linkgit links) doesn't tell us much by themselves. It really needs either better names, or comment describing what they mean, or both. > + It would be good to have comment what this subroutine does. It reads and parses file with list of config variables and their description, and returns reference to list of variables, in the order they were in file, and reference to hash with contents: variable => [ lines of description of variable ] > +sub read_varlist { > + my ($file) = @_; > + > + open my $fh, "<", $file or die "cannot open $file: $!"; > + my (%mainlist, @mainvars); > + > + my ($v, $last_v); > + my $in_block = 0; > + while (<$fh>) { > + if (/^(\S+)::/) { > + $v = lc $1; > + $in_block = 0; > + push @{$mainlist{$v}}, $_; > + push @mainvars, $v; O.K., that is easy to understand (if one remembers about autovivification in Perl). But shouldn't we check if we are not in literal block, just in case? > + } elsif (/^$/ && !$in_block) { > + if (defined $last_v && !$#{$mainlist{$last_v}}) { > + $mainlist{$last_v} = $mainlist{$v}; > + } > + $last_v = $v; What is this branch / this code about? > + } elsif (defined $v) { > + push @{$mainlist{$v}}, $_; > + $in_block = !$in_block if /^--$/; O.K., easy to understand. > + } > + } > + > + close $fh or die "eh? close failed: $!"; > + > + return \%mainlist, \@mainvars; > +} > + It would be nice to have description what those variables should contain (what structure would they have). > +my %vars; > +my %sections; > + It would be good to have comment what this subroutine does; better name than 'read_file' would also be good. This subroutine does *two* things: finds _manual_ section that manpage belongs to, to create correct link (e.g. 1 for git-send-email(1), 5 for gitattributes(5), 7 for gitcli(7)), and finds all manpages that refer to config variable using '^foo.bar::': $vars{'foo.bar'} = [ 'git-foo', ... ] It would also automatically follow includes, excluding ignored files from following 'include::<filename>[]' links. > +sub read_file { > + my ($name, $main_name) = @_; $name is name of current file, $main_name is name of file that included it, isn't it? > + if (!defined $main_name) { > + $main_name = $name; > + } > + my $manpage_name = $main_name; > + $manpage_name =~ s/\.txt//; > + my $fp; > + open $fp, '<', $name or die "open $name failed: $!"; > + while (<$fp>) { > + if ($. < 5 && /^$manpage_name\((\d+)\)/) { > + $sections{$manpage_name} = $1; > + } Shouldn't it be always first line of manpage? > + if (/^([a-z0-9]+\.[a-z<>0-9.]+)::/) { > + push @{$vars{$1}}, $manpage_name; > + } > + if (/^include::\s*(\S+)\s*\[\]/ > + && !exists $ignore{$1}) { > + read_file($1, $main_name); > + } > + } > + close $fp or die "close $name failed: $!"; > +} > + > +foreach my $name (@ARGV) { > + read_file($name); > +} > + > +my ($mainlist, $mainvars) = read_varlist($mainlistfile); > + > +my @all_keys = @$mainvars; > +foreach my $k (keys %vars) { > + if (!exists $mainlist->{$k}) { > + push @all_keys, $k; > + } Nice, so it would detect config variables which are missing from config-vars-src.txt > +} > + > +@all_keys = sort @all_keys unless $no_sort; > + > +my %out; > +foreach my $k (@all_keys) { > + if (exists $mainlist->{$k}) { > + push @{$out{$k}}, @{$mainlist->{$k}}, "\n"; Ah, so it doesn't add 'See linkgit:git-foo[1]' if 'foo.bar' is present in config-vars-src.txt, but only generate notice about config variable in git-config manpage, refering to the page where it is defined and described. This should be make more clear in commit message. > + } elsif (exists $vars{$k}) { > + push @{$out{$k}}, $k . "::\n"; > + push @{$out{$k}}, "\tSee "; Wouldn't this result in something like sendemail.aliasesfile:: See linkgit:git-send-email[1] sendemail.aliasfiletype:: See linkgit:git-send-email[1] instead of sendemail.aliasesfile:: sendemail.aliasfiletype:: See linkgit:git-send-email[1] in the case where config-vars-src.txt is missing some variable? Ah, I see that it is collapsed later. > + my $sep = " "; Not $sep = "", or "\tSee"? Otherwise you would get "\tSee linkgit:git-foo[1]" with double space. > + foreach my $p (sort @{$vars{$k}}) { > + next if ($p =~ /$mainlistfile/); > + if (!exists $sections{$p}) { > + warn "section for $p unknown"; > + $sections{$p} = 1; > + } > + push @{$out{$k}}, $sep . "linkgit:" . $p . "[" . $sections{$p} . "]"; A bit of possible (but perhaps not necessary) refactoring: + push @{$out{$k}}, $sep . gen_linkgit($p); (or something like that). Besides wouldn't it be better to do collapsing based on data, not on formatted string? > + $sep = ", "; > + } > + push @{$out{$k}}, ".\n\n"; > + } else { > + die "can't happen: $k not in any source"; > + } > +} > + A comment what this loop does would be nice. Note that we don't really have to do this collapsing for existing contents; only for contents that was generated by this script. > +for (my $i = 0; $i < $#all_keys; $i++) { > + next if $#{$out{$all_keys[$i]}} != $#{$out{$all_keys[$i+1]}}; > + my $same = 1; > + for (my $j = 1; $j <= $#{$out{$all_keys[$i]}}; $j++) { > + if ($out{$all_keys[$i]}[$j] ne $out{$all_keys[$i+1]}[$j]) { > + $same = 0; > + last; > + } > + } A bit of possible (but perhaps not necessary) refactoring: + next unless eq_deeply($out{$all_keys[$i]}, $out{$all_keys[$i+1]}); (or eq_deeply_arr). > + if ($same) { > + $out{$all_keys[$i]} = [$out{$all_keys[$i]}[0]]; > + } > +} I really think that we can do this in an easier, more elegant, and not that convoluted way. First, we don't really need to store 'foo.bar::' as first element in $out{'foo.bar'}. Second, then compacting is just grouping hash by value. > + > +foreach my $k (@all_keys) { > + print @{$out{$k}}; > +} > -- > 1.7.3.1.281.g5da0b > > -- Jakub Narebski Poland -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html