On Fri, Aug 7, 2020 at 2:28 AM <peterz@xxxxxxxxxxxxx> wrote:
>
>
> One long standing annoyance I have with using vim-tags is that our tags
> file is not properly sorted. That is, the sorting Exuberant Ctags does
> is only on the tag itself.
>
> The problem with that is that, for example, the tag 'mutex' appears a
> mere 505 times, 492 of those are structure members. However it is _far_
> more likely that someone wants the struct definition when looking for
> the mutex tag than any of those members. However, due to the nature of
> the sorting, the struct definition will not be first.
>
> So add a script that does a custom sort of the tags file, taking the tag
> kind into account.
>
> The kind ordering is roughly: 'type', 'function', 'macro', 'enum', rest.
>
> Signed-off-by: Peter Zijlstra (Intel) <peterz@xxxxxxxxxxxxx>
> ---
> Changes since v1:
>  - removed the need for tags.unsorted by using a pipe
>
> Due to this change 'make tags' is now actually faster than it was before
> due to less sorting.
>
>  scripts/sort-tags.awk | 79 +++++++++++++++++++++++++++++++++++++++++++++++++++
>  scripts/tags.sh       | 11 +++++--
>  2 files changed, 87 insertions(+), 3 deletions(-)
>
> diff --git a/scripts/sort-tags.awk b/scripts/sort-tags.awk
> new file mode 100755
> index 000000000000..1eb50406c9d3
> --- /dev/null
> +++ b/scripts/sort-tags.awk
> @@ -0,0 +1,79 @@
> +#!/usr/bin/awk -f
> +
> +# $ ctags --list-kinds
> +# C
> +#   c  classes
> +#   s  structure names
> +#   t  typedefs
> +#   g  enumeration names
> +#   u  union names
> +#   n  namespaces
> +
> +#   f  function definitions
> +#   p  function prototypes [off]
> +#   d  macro definitions
> +
> +#   e  enumerators (values inside an enumeration)
> +#   m  class, struct, and union members
> +#   v  variable definitions
> +
> +#   l  local variables [off]
> +#   x  external and forward variable declarations [off]
> +
> +BEGIN {
> +	FS = "\t"
> +
> +	sort = "LC_ALL=C sort"
> +
> +	# our sort order for C kinds:
> +	order["c"] = "A"
> +	order["s"] = "B"
> +	order["t"] = "C"
> +	order["g"] = "D"
> +	order["u"] = "E"
> +	order["n"] = "F"
> +	order["f"] = "G"
> +	order["p"] = "H"
> +	order["d"] = "I"
> +	order["e"] = "J"
> +	order["m"] = "K"
> +	order["v"] = "L"
> +	order["l"] = "M"
> +	order["x"] = "N"
> +}
> +
> +# pass through header
> +/^!_TAG/ {
> +	print $0
> +	next
> +}
> +
> +{
> +	# find 'kinds'
> +	for (i = 1; i <= NF; i++) {
> +		if ($i ~ /;"$/) {
> +			kind = $(i+1)
> +			break;
> +		}
> +	}
> +
> +	# create sort key
> +	if (order[kind])
> +		key = $1 order[kind];
> +	else
> +		key = $1 "Z";
> +
> +	# get it sorted
> +	print key "\t" $0 |& sort
> +}
> +
> +END {
> +	close(sort, "to")
> +	while ((sort |& getline) > 0) {
> +		# strip key
> +		sub(/[^[:space:]]*[[:space:]]*/, "")
> +		print $0
> +	}
> +	close(sort)
> +}
> +
> diff --git a/scripts/tags.sh b/scripts/tags.sh
> index 4e18ae5282a6..51087c3d8b1e 100755
> --- a/scripts/tags.sh
> +++ b/scripts/tags.sh
> @@ -251,8 +251,10 @@ setup_regex()
>
>  exuberant()
>  {
> +	(
> +
>  	setup_regex exuberant asm c
> -	all_target_sources | xargs $1 -a \
> +	all_target_sources | xargs $1 \
>  	-I __initdata,__exitdata,__initconst,__ro_after_init \
>  	-I __initdata_memblock \
>  	-I __refdata,__attribute,__maybe_unused,__always_unused \
> @@ -266,12 +268,15 @@ exuberant()
>  	-I DEFINE_TRACE,EXPORT_TRACEPOINT_SYMBOL,EXPORT_TRACEPOINT_SYMBOL_GPL \
>  	-I static,const \
>  	--extra=+fq --c-kinds=+px --fields=+iaS --langmap=c:+.h \
> +	--sort=no -o - \
>  	"${regex[@]}"
>
>  	setup_regex exuberant kconfig
> -	all_kconfigs | xargs $1 -a \
> -	--langdef=kconfig --language-force=kconfig "${regex[@]}"
> +	all_kconfigs | xargs $1 \
> +	--langdef=kconfig --language-force=kconfig --sort=no \
> +	-o - "${regex[@]}"
>
> +	) | scripts/sort-tags.awk > tags
>  }
>
>  emacs()

Sorry for the long delay.

First, this patch breaks 'make TAGS' if 'etags' is a symlink to
exuberant ctags.

masahiro@oscar:~/ref/linux$ etags --version
Exuberant Ctags 5.9~svn20110310, Copyright (C) 1996-2009 Darren Hiebert
  Addresses: <dhiebert@xxxxxxxxxxxxxxxxxxxxx>, http://ctags.sourceforge.net
  Optional compiled features: +wildcards, +regex

masahiro@oscar:~/ref/linux$ make TAGS
  GEN     TAGS
etags: Warning: include/linux/seqlock.h:738: null expansion of name pattern "\2"
sed: can't read TAGS: No such file or directory
make: *** [Makefile:1820: TAGS] Error 2

The reason is the hard-coded ' > tags' redirect, which sends the output
to 'tags' even when the target is 'TAGS'. That part is easy to fix.

But, honestly, I am not super happy about this patch.

Reason 1
  In my understanding, sorting by the tag kind only works for ctags.
  My favorite editor is emacs.
  (Do not get me wrong. I do not intend to start an emacs vs vi war.)
  So, I would rather run 'make TAGS' than 'make tags', but this
  solution would not work for etags because etags has a different
  format. I would rather see a more general solution.

Reason 2
  We would have messier code, mixing two files/languages.

When is it useful to tag structure members?

If they are really annoying, why don't we delete them
instead of moving them to the bottom of the tag file?

I attached an alternative solution, and wrote up my thoughts in the log.

What do you think?

--
Best Regards
Masahiro Yamada
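
For reference, the kind-aware ordering from the quoted patch can also be sketched as a plain decorate/sort/strip pipeline that writes to stdout, so the caller chooses the output file instead of a hard-coded ' > tags'. This is only a sketch under assumptions, not the submitted script: the function name sort_tags_by_kind is hypothetical, and it keeps the tag name and the kind rank in two separate sort fields (sort -k1,1 -k2,2) instead of concatenating them into one key and using a gawk coprocess.

```shell
#!/bin/sh
# Hypothetical helper (not part of either patch): read an unsorted ctags
# file on stdin and write it to stdout ordered by tag name, then kind rank.
sort_tags_by_kind() {
    tab=$(printf '\t')
    awk -F '\t' '
        BEGIN {
            # Same ranking as the quoted patch: types, functions, macros, rest.
            n = split("c s t g u n f p d e m v l x", kinds, " ")
            for (i = 1; i <= n; i++)
                rank[kinds[i]] = substr("ABCDEFGHIJKLMN", i, 1)
        }
        {
            # The kind is the field right after the one ending in ;"
            kind = ""
            for (i = 1; i < NF; i++)
                if ($i ~ /;"$/) { kind = $(i + 1); break }
            r = (kind in rank) ? rank[kind] : "Z"
            # Decorate: tag name, rank, then the untouched original line.
            # Pseudo-tag header lines start with "!" and therefore sort
            # ahead of identifiers in the C locale.
            print $1 "\t" r "\t" $0
        }
    ' | LC_ALL=C sort -t "$tab" -k1,1 -k2,2 | cut -f3-
}
```

Assuming the ctags invocation is the one from the quoted diff, it would be used roughly as '... --sort=no -o - "${regex[@]}" | sort_tags_by_kind > tags'. This still only helps ctags-format output, so it does not address Reason 1.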
From 1a003fce7e4f8460ef3256fb5d958fb5c6cc631e Mon Sep 17 00:00:00 2001
From: Masahiro Yamada <masahiroy@xxxxxxxxxx>
Date: Wed, 2 Sep 2020 21:49:59 +0900
Subject: [PATCH] scripts/tags.sh: remove m, v, x tag kinds from exuberant tags

Exuberant Ctags supports the following kinds of tags:

  $ ctags --list-kinds=c
  c  classes
  d  macro definitions
  e  enumerators (values inside an enumeration)
  f  function definitions
  g  enumeration names
  l  local variables [off]
  m  class, struct, and union members
  n  namespaces
  p  function prototypes [off]
  s  structure names
  t  typedefs
  u  union names
  v  variable definitions
  x  external and forward variable declarations [off]

This commit excludes 'm', 'v', and 'x'.

Peter Zijlstra states:

  "The problem with that is that, for example, the tag 'mutex' appears a
   mere 505 times, 492 of those are structure members. However it is _far_
   more likely that someone wants the struct definition when looking for
   the mutex tag than any of those members."
  (https://lkml.org/lkml/2020/8/6/512)

So, 'm' is more annoying than useful. For the same reason, it seems
better to turn off 'v'. You may argue about the criteria, but we need
to draw a line somewhere to make it reasonable for the majority of
people.

We flipped 'p' and 'x' in the past:

[1] commit f6333eb4e788 ("kbuild: Add ctags support for function
    prototypes and external variable declarations") added 'p' and 'x',
    but did not explain when they are actually useful.

[2] commit 7db86dc97fb0 ("ctags: usability fix") removed 'p' and 'x',
    stating both of them make no real sense.

[3] commit 0a18a9386c05 ("tags: put function prototypes back!")
    re-added 'p' and 'x', but the commit log only mentioned 'p'.

OK, [3] clearly explained why 'p' is useful, but turned --c-kinds=-px
into --c-kinds=+px. So, 'x' was also (accidentally?) enabled. I think
it should have been --c-kinds=+p-x, or more simply --c-kinds=+p since
'x' is off by default. It seems to be a bug in [3], so I disabled 'x'
to get back the pre-[3] behavior.

'make tags' and 'make TAGS' will run faster and create a much smaller
tags file if ctags is Exuberant Ctags.

Reviewed-by: Peter Zijlstra (Intel) <peterz@xxxxxxxxxxxxx>
Signed-off-by: Masahiro Yamada <masahiroy@xxxxxxxxxx>
---
 scripts/tags.sh | 15 +--------------
 1 file changed, 1 insertion(+), 14 deletions(-)

diff --git a/scripts/tags.sh b/scripts/tags.sh
index 32d3f53af10b..440f4ecad43b 100755
--- a/scripts/tags.sh
+++ b/scripts/tags.sh
@@ -179,19 +179,6 @@ regex_c=(
 	'/^DEF_PCI_AC_\(\|NO\)RET(\([[:alnum:]_]*\).*/\2/'
 	'/^PCI_OP_READ(\(\w*\).*[1-4])/pci_bus_read_config_\1/'
 	'/^PCI_OP_WRITE(\(\w*\).*[1-4])/pci_bus_write_config_\1/'
-	'/\<DEFINE_\(RT_MUTEX\|MUTEX\|SEMAPHORE\|SPINLOCK\)(\([[:alnum:]_]*\)/\2/v/'
-	'/\<DEFINE_\(RAW_SPINLOCK\|RWLOCK\|SEQLOCK\)(\([[:alnum:]_]*\)/\2/v/'
-	'/\<DECLARE_\(RWSEM\|COMPLETION\)(\([[:alnum:]_]\+\)/\2/v/'
-	'/\<DECLARE_BITMAP(\([[:alnum:]_]*\)/\1/v/'
-	'/\(^\|\s\)\(\|L\|H\)LIST_HEAD(\([[:alnum:]_]*\)/\3/v/'
-	'/\(^\|\s\)RADIX_TREE(\([[:alnum:]_]*\)/\2/v/'
-	'/\<DEFINE_PER_CPU([^,]*, *\([[:alnum:]_]*\)/\1/v/'
-	'/\<DEFINE_PER_CPU_SHARED_ALIGNED([^,]*, *\([[:alnum:]_]*\)/\1/v/'
-	'/\<DECLARE_WAIT_QUEUE_HEAD(\([[:alnum:]_]*\)/\1/v/'
-	'/\<DECLARE_\(TASKLET\|WORK\|DELAYED_WORK\)(\([[:alnum:]_]*\)/\2/v/'
-	'/\(^\s\)OFFSET(\([[:alnum:]_]*\)/\2/v/'
-	'/\(^\s\)DEFINE(\([[:alnum:]_]*\)/\2/v/'
-	'/\<\(DEFINE\|DECLARE\)_HASHTABLE(\([[:alnum:]_]*\)/\2/v/'
 	'/\<DEFINE_ID\(R\|A\)(\([[:alnum:]_]\+\)/\2/'
 	'/\<DEFINE_WD_CLASS(\([[:alnum:]_]\+\)/\1/'
 	'/\<ATOMIC_NOTIFIER_HEAD(\([[:alnum:]_]\+\)/\1/'
@@ -255,7 +242,7 @@ exuberant()
 	-I EXPORT_SYMBOL,EXPORT_SYMBOL_GPL,ACPI_EXPORT_SYMBOL \
 	-I DEFINE_TRACE,EXPORT_TRACEPOINT_SYMBOL,EXPORT_TRACEPOINT_SYMBOL_GPL \
 	-I static,const \
-	--extra=+fq --c-kinds=+px --fields=+iaS --langmap=c:+.h \
+	--extra=+fq --c-kinds=+p-mv --fields=+iaS --langmap=c:+.h \
 	"${regex[@]}"

 	setup_regex exuberant kconfig
--
2.25.1
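
To check a per-kind breakdown like the 'mutex' figures quoted in the log, or to confirm that no 'm' or 'v' entries remain after this change, something like the following could be run against a generated tags file. It is a minimal sketch: the function name count_kinds is hypothetical, and it assumes the Exuberant Ctags extended format, where the kind letter is the field right after the one ending in ;".

```shell
#!/bin/sh
# Hypothetical helper (not part of the patch): tally the kind letters
# recorded for one tag name.
# Usage: count_kinds TAG [FILE]    (FILE defaults to ./tags)
count_kinds() {
    awk -F '\t' -v tag="$1" '
        /^!_TAG/ { next }                # skip the pseudo-tag header
        $1 == tag {
            # The kind is the field right after the one ending in ;"
            for (i = 1; i < NF; i++)
                if ($i ~ /;"$/) { n[$(i + 1)]++; break }
        }
        END { for (k in n) print k, n[k] }
    ' "${2:-tags}" | LC_ALL=C sort
}
```

Running 'count_kinds mutex' before and after applying the patch prints one "kind count" line per kind letter, which makes the effect of --c-kinds=+p-mv on a single tag easy to see.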