Re: [PATCH v3 03/18] perf capstone: Move capstone functionality into its own file

Namhyung Kim <namhyung@xxxxxxxxxx> · Tue, 28 Jan 2025 12:56:22 -0800

On Fri, Jan 24, 2025 at 04:59:21PM -0800, Ian Rogers wrote:
> On Fri, Jan 24, 2025 at 1:58 PM Namhyung Kim <namhyung@xxxxxxxxxx> wrote:
> >
> > On Wed, Jan 22, 2025 at 09:42:53AM -0800, Ian Rogers wrote:
> > > Capstone disassembly support was split between disasm.c and
> > > print_insn.c. Move support out of these files into capstone.[ch] and
> > > remove include capstone/capstone.h from those files. As disassembly
> > > routines can fail, make failure the only option without
> > > HAVE_LIBCAPSTONE_SUPPORT. For simplicity's sake, duplicate the
> > > read_symbol utility function.
> > >
> > > The intent with moving capstone support into a single file is that
> > > dynamic support, using dlopen for libcapstone, can be added in later
> > > patches. This can potentially always succeed or fail, so relying on
> > > ifdefs isn't sufficient. Using dlopen is a useful option to minimize
> > > the perf tools dependencies and potentially size.
> > >
> > > Signed-off-by: Ian Rogers <irogers@xxxxxxxxxx>
> > > ---
> > >  tools/perf/builtin-script.c  |   2 -
> > >  tools/perf/util/Build        |   1 +
> > >  tools/perf/util/capstone.c   | 536 +++++++++++++++++++++++++++++++++++
> > >  tools/perf/util/capstone.h   |  24 ++
> > >  tools/perf/util/disasm.c     | 358 +----------------------
> > >  tools/perf/util/print_insn.c | 117 +-------
> > >  6 files changed, 569 insertions(+), 469 deletions(-)
> > >  create mode 100644 tools/perf/util/capstone.c
> > >  create mode 100644 tools/perf/util/capstone.h
> > >
> > > diff --git a/tools/perf/builtin-script.c b/tools/perf/builtin-script.c
> > > index 33667b534634..f05b2b70d5a7 100644
> > > --- a/tools/perf/builtin-script.c
> > > +++ b/tools/perf/builtin-script.c
> > > @@ -1200,7 +1200,6 @@ static int any_dump_insn(struct evsel *evsel __maybe_unused,
> > >                        u8 *inbuf, int inlen, int *lenp,
> > >                        FILE *fp)
> > >  {
> > > -#ifdef HAVE_LIBCAPSTONE_SUPPORT
> > >       if (PRINT_FIELD(BRSTACKDISASM)) {
> > >               int printed = fprintf_insn_asm(x->machine, x->thread, x->cpumode, x->is64bit,
> > >                                              (uint8_t *)inbuf, inlen, ip, lenp,
> > > @@ -1209,7 +1208,6 @@ static int any_dump_insn(struct evsel *evsel __maybe_unused,
> > >               if (printed > 0)
> > >                       return printed;
> > >       }
> > > -#endif
> > >       return fprintf(fp, "%s", dump_insn(x, ip, inbuf, inlen, lenp));
> > >  }
> > >
> > > diff --git a/tools/perf/util/Build b/tools/perf/util/Build
> > > index 5ec97e8d6b6d..9542decf9625 100644
> > > --- a/tools/perf/util/Build
> > > +++ b/tools/perf/util/Build
> > > @@ -8,6 +8,7 @@ perf-util-y += block-info.o
> > >  perf-util-y += block-range.o
> > >  perf-util-y += build-id.o
> > >  perf-util-y += cacheline.o
> > > +perf-util-y += capstone.o
> > >  perf-util-y += config.o
> > >  perf-util-y += copyfile.o
> > >  perf-util-y += ctype.o
> > > diff --git a/tools/perf/util/capstone.c b/tools/perf/util/capstone.c
> > > new file mode 100644
> > > index 000000000000..c0a6d94ebc18
> > > --- /dev/null
> > > +++ b/tools/perf/util/capstone.c
> > > @@ -0,0 +1,536 @@
> > > +// SPDX-License-Identifier: GPL-2.0
> > > +#include "capstone.h"
> > > +#include "annotate.h"
> > > +#include "addr_location.h"
> > > +#include "debug.h"
> > > +#include "disasm.h"
> > > +#include "dso.h"
> > > +#include "machine.h"
> > > +#include "map.h"
> > > +#include "namespaces.h"
> > > +#include "print_insn.h"
> > > +#include "symbol.h"
> > > +#include "thread.h"
> > > +#include <fcntl.h>
> > > +#include <string.h>
> > > +
> > > +#ifdef HAVE_LIBCAPSTONE_SUPPORT
> > > +#include <capstone/capstone.h>
> > > +#endif
> >
> > I think you can use a big #ifdef throughout the file to minimize the
> > #ifdef dances.  Usually it goes to the header to provide dummy static
> > inlines and make the .c file depends on config.  But I know you will
> > add dlopen code for the #else case later.
> 
> So I think big ifdefs like:
> 
> #if HAVE_xyz
> // 100s of lines
> #else
> // 100s of lines
> #endif
> 
> are best avoided. It is also the point of the shim-ing that we do
> 
> ... perf_foobar(...)
> {
> #if NO_SHIM
>   ... foobar(...);
> #else
>   //dlsym code
> #endif
> }
> 
> Having the shimming and not shimming as two separate functions buried
> in a 100 #ifdef loses that the code is common except for the shimming.

Right, can we split the common part and move it out of #ifdef?

> 
> For example, in the current code we have find_file_offset:
> https://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools-next.git/tree/tools/perf/util/disasm.c?h=perf-tools-next&id=91b7747dc70d64b5ec56ffe493310f207e7ffc99#n1371
> 
> It is only possible to understand the use of this seemingly common
> code by trying to interpret what's going on with the #ifdefs.
> 
> I think it stylistically it is okay to have multiple stubbed out
> functions inside a #if or #else, such as:
> https://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools-next.git/tree/include/linux/perf_event.h?h=perf-tools-next&id=91b7747dc70d64b5ec56ffe493310f207e7ffc99#n1797

I think the convention in the kernel community is that it's better to
remove #ifdef's in .c files and add a dummy functions under !condition
in .h files.  If that's not possible, I think we should make the code
less conditional by minimizing the #ifdef's.

> 
> But when the logic is shared and all in one file it becomes next to
> impossible to determine what's in use and what's not. Other than by
> tweaking things and trying to get build errors.
> 
> So for the shims I've placed the #if inside the function to make it
> clear the function is a shim. For the other functions that are over
> 100s of lines, for clarity the individual functions have #if
> HAVE_LIBLLVM_SUPPORT around them to make it clear that the function
> only has a meaning in that context - ie the source code doesn't make
> you go on a #ifdef finding expedition to try to understand when the
> code is in use.

I think the both approaches have their own pros and cons.  Some people
prefer one and others may have different opinions.  I think the big
conditional block is better and easy to follow.  Maybe we cannot agree
on this.  Then I believe it'd be better to follow the convention, no?

Thanks,
Namhyung