(bumping a few points for v3) On Thu, Jul 9, 2020 at 10:56 AM Nathan Huckleberry <nhuck@xxxxxxxxxx> wrote: > > On Wed, Jul 8, 2020 at 2:11 PM Nick Desaulniers <ndesaulniers@xxxxxxxxxx> wrote: > > > > On Wed, Jul 8, 2020 at 11:21 AM 'Nathan Huckleberry' via Clang Built > > Linux <clang-built-linux@xxxxxxxxxxxxxxxx> wrote: > > > > > I think we should add scripts/clang-tools/ to MAINTAINERS under > > CLANG/LLVM SUPPORT: > > ``` > > diff --git a/MAINTAINERS b/MAINTAINERS > > index c87b94e6b2f6..42602231929c 100644 > > --- a/MAINTAINERS > > +++ b/MAINTAINERS > > @@ -4211,6 +4211,7 @@ W: https://clangbuiltlinux.github.io/ > > B: https://github.com/ClangBuiltLinux/linux/issues > > C: irc://chat.freenode.net/clangbuiltlinux > > F: Documentation/kbuild/llvm.rst > > +F: scripts/clang-tools/ > > K: \b(?i:clang|llvm)\b > > > > CLEANCACHE API > > ``` > > that way we get cc'ed properly on proposed changes (should folks use > > scripts/get_maintainer.pl). bump > > > --- /dev/null > > > +++ b/scripts/clang-tools/run-clang-tools.py > > > @@ -0,0 +1,77 @@ > > > +#!/usr/bin/env python > > > +# SPDX-License-Identifier: GPL-2.0 > > > +# > > > +# Copyright (C) Google LLC, 2020 > > > +# > > > +# Author: Nathan Huckleberry <nhuck@xxxxxxxxxx> > > > +# > > > +"""A helper routine run clang-tidy and the clang static-analyzer on > > > +compile_commands.json.""" > > > + > > > +import argparse > > > +import json > > > +import logging > > > +import multiprocessing > > > +import os > > > +import re Is `re` being used anywhere? > > > +import subprocess > > > + > > > +def parse_arguments(): > > > + """Set up and parses command-line arguments. > > > + Returns: > > > + args: Dict of parsed args > > > + Has keys 'file' and 'type' > > > + """ > > > + usage = """Run clang-tidy or the clang static-analyzer on a > > > + compilation database.""" > > > + parser = argparse.ArgumentParser(description=usage) > > > + > > > + type_help = ('Type of analysis to be performed') > > > + parser.add_argument('type', choices=['clang-tidy', 'static-analyzer'], > > > + help=type_help) > > > + file_path_help = ('Path to the compilation database to parse') > > > + parser.add_argument('file', type=str, help=file_path_help) > > > > I don't know if the kernel has a preferred style for Python, but I > > think it would be good to be consistent in the use of single vs double > > quotes for strings. My preference is for double quotes, but I don't > > know enough about the various PEPs for style or if the kernel has a > > preferred style for these. double quotes. > > > + > > > + args = parser.parse_args() > > > + > > > + return args > > > + > > > +def init(l,t): > > > + global lock > > > + global analysis_type > > > + lock = l > > > + analysis_type = t > > > > Is this canonical Python? Maybe wrap these functions into methods of > > an object you construct, that way you can assign these as instance > > variables against `self`, rather than using global variables. > > I did this to allow shared locks between processes, see > https://stackoverflow.com/questions/25557686/python-sharing-a-lock-between-processes Ah, ok, I see the problem. In that case, I'm less worried about this. `global` just sets off red flags unless there is a very good reason to use it. > > > > > > + > > > +def run_analysis(entry): > > > + filename = entry['file'] > > > + p = None > > > + if(analysis_type == "clang-tidy"): > > > + p = subprocess.run(["clang-tidy", "-p", os.getcwd(), > > > + "-checks=-*,linuxkernel-*", filename], > > > + stdout=subprocess.PIPE, stderr=subprocess.PIPE) > > > + if(analysis_type == "static-analyzer"): > > > + p = subprocess.run(["clang-tidy", "-p", os.getcwd(), > > > + "-checks=-*,clang-analyzer-*", filename], Is it worthwhile to NOT run `-*` passes and only run `clang-analyzer-*`? Otherwise `make clang-analyzer` and `make clang-tidy` contain a ton of duplicate info. > > > + stdout=subprocess.PIPE, stderr=subprocess.PIPE) > > > > When you have a fair amount of duplication between two branches of an > > if/else (for instance, same method invocation and number of > > parameters, just slight differences in parameter values), consider if > > you can use a ternary to simplify or make the code more concise. That > > would also help avoid initializing `p` to `None`: > > > > checks = "-checks=-*,linuxkernel-*" if analysis_type == "clang-tidy" > > else "-checks=-*,clang-analyzer-*" > > p = subprocess.run(["clang-tidy", "-p", os.getcwd(), checks, > > stdout=subprocess.PIPE, stderr=subprocess.PIPE] > > > > then maybe do some validation of the analysis_type when validating > > command line arguments earlier. > > Argparse should already handle validation of the analysis type. Cool, I still think the ternary can simplify a v3. > > > > > > + lock.acquire() > > > + print(entry['file']) > > > + os.write(1, p.stdout) > > > + os.write(2, p.stderr) > > > > Please use sys.stdout and sys.stderr rather than magic constants for > > their file descriptors. Also, I'm not a fan of how clang-tidy writes the errors to stdout. $ make LLVM=1 -j71 defconfig clang-tidy 2> log.txt write part of the log, and spews to stdout. Do you think it would make sense to redirect stdout from clang-tidy to stderr for this script? $ grep error: log.txt | sort -u $ grep clang-analyzer log.txt | sort -u Checking some of the clang-tidy warnings, some seem harmless. linux-next/net/core/devlink.c:9527:6: error: redefinition of 'devlink_compat_running_version' [clang-diagnostic-error] looks legit, though not terribly important to fix ASAP. So that's cool. The clang-analyzer report is a little beefier, once the traces start getting long they become fairly hard to follow. Is it possible to dump the html report of these? I guess the issue with that is that we wouldn't be able to join them in the python script. $ grep clang-analyzer log.txt | sort -u | cut -d '[' -f 2 | sort -u clang-analyzer-core.CallAndMessage] clang-analyzer-core.DivideZero] clang-analyzer-core.NonNullParamChecker] clang-analyzer-core.NullDereference] clang-analyzer-core.StackAddressEscape] clang-analyzer-core.UndefinedBinaryOperatorResult] clang-analyzer-core.uninitialized.ArraySubscript] clang-analyzer-core.uninitialized.Assign] clang-analyzer-core.uninitialized.Branch] clang-analyzer-core.uninitialized.UndefReturn] clang-analyzer-deadcode.DeadStores] clang-analyzer-optin.performance.Padding] clang-analyzer-optin.portability.UnixAPI] clang-analyzer-security.insecureAPI.bcmp] clang-analyzer-security.insecureAPI.DeprecatedOrUnsafeBufferHandling] clang-analyzer-security.insecureAPI.strcpy] clang-analyzer-unix.cstring.NullArg] clang-analyzer-unix.Malloc] clang-analyzer-valist.Uninitialized] interesting. I like how clang-analyzer warns about bcmp and yet llvm will generate calls to bcmp... -- Thanks, ~Nick Desaulniers