[PATCH v2 0/8] x86: undwarf unwinder

v2:

- 2x performance improvement by using a fast lookup table and splitting
  undwarf array into two parallel arrays (Andy L)
- reduce data size by ~1MB by getting rid of 'len' field
- sort and post-process data at boot time
- don't search vmlinux tables for module addresses (Peter Z)
- disable preemption to prevent module from getting unloaded while
  reading its undwarf data (Peter Z)
- avoid unwinding a running task's stack (Jiri S)
- remove '__sp' constraint from inline asm (Jiri S)
- rename "CFI_*" -> "UNWIND_HINT_*" (Andy L)
- replace '999:' label with '.Lunwind_hint_ip_\@' (Andy L)
- entry code annotation fixes: extra=0 fix, symmetrical macro
  annotations, ret_from_fork fix (Andy L)
- invalidate all object files when enabling/disabling
  CONFIG_UNDWARF_UNWINDER
- pass ip-1 to undwarf_find() for call return addresses to fix stack
  traces for sibling calls and noreturn calls at end of function
- docs: clarify benefits vs frame pointers (Ingo)
- docs: improve wording, add more info, add performance info from Mel G
  and Jiri S, move to kernel docs dir
- objtool: several minor fixes (Jiri S)
- objtool: append file instead of rewriting it
- objtool: improve elf warnings
- objtool: fix handling of the GCC DRAP register for aligned stacks
- objtool: rewrite 'undwarf dump' command to be much faster and to work
  on vmlinux
- objtool: rename undwarf.c -> undwarf_gen.c
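
To illustrate the ip-1 item above, here is a minimal userspace sketch of
why the lookup address is adjusted for call return addresses.  The table
layout and the names (struct range, owner_of, owner_for_unwind) are made
up for this example and are not the patch's code.  If a function ends
with a call to a noreturn function, the return address is the first byte
of the *next* function, so looking it up unmodified would hit the wrong
entry.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical table entry: a region starting at 'start' belongs to func_id. */
struct range {
	uint32_t start;
	int func_id;
};

/* Return the func_id of the last entry with start <= addr, or -1 if none. */
static int owner_of(const struct range *tbl, size_t n, uint32_t addr)
{
	int id = -1;

	for (size_t i = 0; i < n && tbl[i].start <= addr; i++)
		id = tbl[i].func_id;
	return id;
}

/*
 * For a call return address, look up ip - 1 so the address still falls
 * inside the calling function, even when the call was its last
 * instruction.  For an interrupted ip, look up the address as-is.
 */
static int owner_for_unwind(const struct range *tbl, size_t n,
			    uint32_t ip, int is_call_return)
{
	return owner_of(tbl, n, is_call_return ? ip - 1 : ip);
}
```

With a table where function 1 starts at 0x100 and function 2 starts at
0x120, a return address of 0x120 resolves to function 1 only when it is
treated as a call return address.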

-----

Create a new 'undwarf' unwinder, enabled by CONFIG_UNDWARF_UNWINDER, and
plug it into the x86 unwinder framework.  Objtool is used to generate
the undwarf debuginfo.  The undwarf debuginfo format is basically a
simplified version of DWARF CFI.  More details below.

The unwinder works well in my testing.  It unwinds through interrupts,
exceptions, and preemption, with and without frame pointers, across
aligned stacks and dynamically allocated stacks.  If something goes
wrong during an oops, it successfully falls back to printing the '?'
entries just like the frame pointer unwinder.

I'm not tied to the 'undwarf' name; other naming ideas are welcome.

Some potential future improvements:
- properly annotate or fix whitelisted functions and files
- reduce the number of base CFA registers needed in entry code
- compress undwarf debuginfo to use less memory
- make it easier to disable CONFIG_FRAME_POINTER
- add reliability checks for livepatch
- runtime NMI stack reliability checker

This code can also be found at:

  git://github.com/jpoimboe/linux undwarf-v2

Here are the contents of the undwarf.txt file, which explains the 'why'
in more detail:


Undwarf unwinder debuginfo generation
=====================================

Overview
--------

The kernel CONFIG_UNDWARF_UNWINDER option enables objtool generation of
undwarf debuginfo, which is out-of-band data used by the in-kernel
undwarf unwinder.  It's similar in concept to the DWARF CFI debuginfo
which would be used by a DWARF unwinder.  The difference is that the
format of the undwarf data is simpler than DWARF, which in turn allows
the unwinder to be simpler and faster.

Objtool generates the undwarf data by first doing compile-time stack
metadata validation (CONFIG_STACK_VALIDATION).  After analyzing all the
code paths of a .o file, it determines information about the stack state
at each instruction address in the file and outputs that information to
the .undwarf and .undwarf_ip sections.

The undwarf sections are combined at link time and are sorted at boot
time.  The unwinder uses the resulting data to correlate instruction
addresses with their stack states at run time.


Undwarf vs frame pointers
-------------------------

With frame pointers enabled, GCC adds instrumentation code to every
function in the kernel.  The kernel's .text size increases by about
3.2%, resulting in a broad kernel-wide slowdown.  Measurements by Mel
Gorman [1] have shown a slowdown of 5-10% for some workloads.

In contrast, the undwarf unwinder has no effect on text size or runtime
performance, because the debuginfo is out of band.  So if you disable
frame pointers and enable undwarf, you get a nice performance
improvement across the board, and still have reliable stack traces.

Another benefit of undwarf compared to frame pointers is that it can
reliably unwind across interrupts and exceptions.  Frame pointer based
unwinds can skip the caller of the interrupted function if the
interrupted function was a leaf function or if the interrupt hit before
the frame pointer was saved.
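
The skipped-caller problem can be reproduced with a small userspace
simulation.  This is only a sketch: the simulated stack, the addresses,
and the function names are invented for the example.  It walks a
conventional frame pointer chain, where [bp] holds the caller's saved bp
and [bp + 8] holds the return address.

```c
#include <stddef.h>
#include <stdint.h>

/* Simulated stack: words[i] holds the 8-byte value at address base + 8*i. */
struct sim_stack {
	uint64_t base;
	const uint64_t *words;
	size_t nwords;
};

/* Read one aligned word from the simulated stack; returns 0 on a bad address. */
static int sim_read(const struct sim_stack *s, uint64_t addr, uint64_t *val)
{
	if (addr < s->base || (addr - s->base) % 8)
		return 0;
	if ((addr - s->base) / 8 >= s->nwords)
		return 0;
	*val = s->words[(addr - s->base) / 8];
	return 1;
}

/*
 * Frame pointer walk starting from an interrupted context.  The first
 * trace entry is the interrupted ip; the rest are return addresses
 * found by following the bp chain.  Returns the number of entries.
 */
static size_t fp_walk(const struct sim_stack *s, uint64_t ip, uint64_t bp,
		      uint64_t *trace, size_t max)
{
	size_t n = 0;
	uint64_t ret, next_bp;

	if (n < max)
		trace[n++] = ip;

	while (n < max && bp &&
	       sim_read(s, bp, &next_bp) && sim_read(s, bp + 8, &ret)) {
		trace[n++] = ret;
		bp = next_bp;
	}
	return n;
}
```

Suppose A calls B and B calls a leaf function C which never sets up its
own frame.  When an interrupt hits in C, bp still points at B's frame,
whose return address points into A.  The walk therefore reports C, A,
and A's caller, and never sees the return address into B, which sits on
the stack just below B's frame.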

The main disadvantage of undwarf compared to frame pointers is that it
needs more memory to store the undwarf table: roughly 3-5MB depending on
the kernel config.


Undwarf vs DWARF
----------------

Undwarf debuginfo's advantage over DWARF itself is that it's much
simpler.  It gets rid of the complex DWARF CFI state machine and also
gets rid of the tracking of unnecessary registers.  This allows the
unwinder to be much simpler, meaning fewer bugs, which is especially
important for mission-critical oops code.

The simpler debuginfo format also enables the unwinder to be much faster
than DWARF, which is important for perf and lockdep.  In a basic
performance test by Jiri Slaby [2], the undwarf unwinder was about 20x
faster than an out-of-tree DWARF unwinder.  (Note: that measurement was
taken before some performance tweaks were implemented, so the speedup
may be even higher.)

The undwarf format does have a few downsides compared to DWARF.  The
undwarf table takes up ~2MB more memory than a DWARF .eh_frame table.

Another potential downside is that, as GCC evolves, it's conceivable
that the undwarf data may end up being *too* simple to describe the
state of the stack for certain optimizations.  But IMO this is unlikely
because GCC saves the frame pointer for any unusual stack adjustments it
does, so I suspect we'll really only ever need to keep track of the
stack pointer and the frame pointer between call frames.  But even if we
do end up having to track all the registers DWARF tracks, at least we
will still be able to control the format, e.g.  no complex state
machines.
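
To make the "stack pointer and frame pointer only" idea concrete, here
is a hedged userspace sketch of a single unwind step driven by one
table entry.  The struct frame_rule layout and all names are
hypothetical and much simpler than the real undwarf struct.  It relies
only on the usual x86-64 convention that the return address sits at
CFA - 8, where the CFA is the caller's stack pointer value just before
the call.

```c
#include <stddef.h>
#include <stdint.h>

enum base_reg { BASE_SP, BASE_BP };

/* Hypothetical per-address rule: how to find the previous frame. */
struct frame_rule {
	enum base_reg cfa_base;	/* register the CFA is computed from */
	int32_t cfa_offset;	/* CFA = base register + cfa_offset */
	int bp_saved;		/* nonzero if the caller's bp is on the stack */
	int32_t bp_offset;	/* caller's bp is at CFA + bp_offset */
};

struct regs {
	uint64_t ip, sp, bp;
};

/* Simulated stack: words[i] holds the 8-byte value at address base + 8*i. */
struct sim_stack {
	uint64_t base;
	const uint64_t *words;
	size_t nwords;
};

static int sim_read(const struct sim_stack *s, uint64_t addr, uint64_t *val)
{
	if (addr < s->base || (addr - s->base) % 8)
		return 0;
	if ((addr - s->base) / 8 >= s->nwords)
		return 0;
	*val = s->words[(addr - s->base) / 8];
	return 1;
}

/* Advance regs to the caller's frame.  Returns 1 on success, 0 on failure. */
static int unwind_step(const struct frame_rule *r, const struct sim_stack *s,
		       struct regs *regs)
{
	uint64_t base = (r->cfa_base == BASE_SP) ? regs->sp : regs->bp;
	uint64_t cfa = base + (int64_t)r->cfa_offset;
	uint64_t ret, bp = regs->bp;

	if (!sim_read(s, cfa - 8, &ret))
		return 0;
	if (r->bp_saved && !sim_read(s, cfa + (int64_t)r->bp_offset, &bp))
		return 0;

	regs->ip = ret;
	regs->sp = cfa;
	regs->bp = bp;
	return 1;
}
```

For example, right after a function's "push %rbp" the rule would be
CFA = sp + 16 with the caller's bp saved at CFA - 16.  No state machine
is involved: one table entry fully describes one step.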


Undwarf debuginfo generation
----------------------------

The undwarf data is generated by objtool.  With the existing
compile-time stack metadata validation feature, objtool already follows
all code paths, and so it already has all the information it needs to be
able to generate undwarf data from scratch.  So it's an easy step to go
from stack validation to undwarf generation.

It should be possible to instead generate the undwarf data with a simple
tool which converts DWARF to undwarf.  However, such a solution would be
incomplete due to the kernel's extensive use of asm, inline asm, and
special sections like exception tables.

That could be rectified by manually annotating those special code paths
using GNU assembler .cfi annotations in .S files, and homegrown
annotations for inline asm in .c files.  But asm annotations were tried
in the past and were found to be unmaintainable.  They were often
incorrect/incomplete and made the code harder to read and keep updated.
And based on looking at glibc code, annotating inline asm in .c files
might be even worse.

Objtool still needs a few annotations, but only in code which does
unusual things to the stack like entry code.  And even then, far fewer
annotations are needed than what DWARF would need, so they're much more
maintainable than DWARF CFI annotations.

So the advantages of using objtool to generate undwarf are that it gives
more accurate debuginfo, with very few annotations.  It also insulates
the kernel from toolchain bugs, which can be very painful to deal with
in the kernel since we often have to work around issues in older
versions of the toolchain for years.

The downside is that the unwinder now becomes dependent on objtool's
ability to reverse engineer GCC code paths.  If GCC optimizations become
too complicated for objtool to follow, the undwarf generation might stop
working or become incomplete.  (It's worth noting that livepatch already
has such a dependency on objtool's ability to follow GCC code paths.)

If newer versions of GCC come up with some optimizations which break
objtool, we may need to revisit the current implementation.  Some
possible solutions would be asking GCC to make the optimizations more
palatable, or having objtool use DWARF as an additional input, or
creating a GCC plugin to assist objtool with its analysis.  But for now,
objtool follows GCC code quite well.


Unwinder implementation details
-------------------------------

Objtool generates the undwarf data by integrating with the compile-time
stack metadata validation feature, which is described in detail in
tools/objtool/Documentation/stack-validation.txt.  After analyzing all
the code paths of a .o file, it creates an array of undwarf structs, and
a parallel array of instruction addresses associated with those structs,
and writes them to the .undwarf and .undwarf_ip sections respectively.

The undwarf data is split into the two arrays for performance reasons,
to make the searchable part of the data (.undwarf_ip) more compact.  The
arrays are sorted in parallel at boot time.
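
Sorting "in parallel" means every move applied to the address array is
mirrored in the struct array so that index i in both arrays keeps
describing the same instruction.  A minimal userspace sketch follows.
The struct stack_state layout is a placeholder, and insertion sort is
used purely to keep the example short; neither is taken from the
patches.

```c
#include <stddef.h>
#include <stdint.h>

/* Placeholder for the per-address stack state record. */
struct stack_state {
	int16_t cfa_offset;
};

/*
 * Sort ips[] in ascending order and apply the same permutation to
 * states[], so that states[i] still belongs to ips[i] afterwards.
 */
static void sort_parallel(uint32_t *ips, struct stack_state *states, size_t n)
{
	for (size_t i = 1; i < n; i++) {
		uint32_t ip = ips[i];
		struct stack_state st = states[i];
		size_t j = i;

		/* Shift larger keys up, moving their records with them. */
		while (j > 0 && ips[j - 1] > ip) {
			ips[j] = ips[j - 1];
			states[j] = states[j - 1];
			j--;
		}
		ips[j] = ip;
		states[j] = st;
	}
}
```

Keeping the keys in their own array means a search touches only the
compact address data, and the larger records are read once, at the
final index.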

Performance is further improved by the use of a fast lookup table which
is created at runtime.  The fast lookup table associates a given address
with a range of undwarf table indices, so that only a small subset of
the undwarf table needs to be searched.
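
One way such a lookup table could work is sketched below.  The block
size, the function names, and the exact table contents are assumptions
made for this example, not a description of the actual implementation.
The text range is divided into fixed-size blocks, and for each block the
table records the index of the first entry starting at or after the
block's first address.

```c
#include <stddef.h>
#include <stdint.h>

#define BLOCK_SIZE 256u	/* assumed block size, chosen arbitrarily */

/*
 * Fill lookup[0..nblocks] (nblocks + 1 slots) so that lookup[b] is the
 * index of the first entry in the sorted ips[] array whose address is
 * >= base + b * BLOCK_SIZE.
 */
static void build_lookup(const uint32_t *ips, size_t n, uint32_t base,
			 uint32_t *lookup, size_t nblocks)
{
	size_t i = 0;

	for (size_t b = 0; b <= nblocks; b++) {
		uint32_t block_start = base + (uint32_t)b * BLOCK_SIZE;

		while (i < n && ips[i] < block_start)
			i++;
		lookup[b] = (uint32_t)i;
	}
}

/*
 * Return the index of the last entry with ips[i] <= addr, searching
 * only the entries which can cover addr's block: the entry just before
 * the block (it may extend into the block) through the last entry
 * starting inside the block.  Returns -1 if nothing covers addr.
 */
static long fast_find(const uint32_t *ips, uint32_t base,
		      const uint32_t *lookup, size_t nblocks, uint32_t addr)
{
	size_t b, lo, hi;
	long found = -1;

	if (addr < base)
		return -1;
	b = (addr - base) / BLOCK_SIZE;
	if (b >= nblocks)
		return -1;

	lo = lookup[b] ? lookup[b] - 1 : 0;
	hi = lookup[b + 1];

	/* The subset is small; a binary search would also work here. */
	for (size_t i = lo; i < hi && ips[i] <= addr; i++)
		found = (long)i;
	return found;
}
```

The table costs one integer per block, and in exchange each lookup
starts from a range of a few entries instead of the whole table.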


[1] https://lkml.kernel.org/r/20170602104048.jkkzssljsompjdwy@xxxxxxx
[2] https://lkml.kernel.org/r/d2ca5435-6386-29b8-db87-7f227c2b713a@xxxxxxx


Josh Poimboeuf (8):
  objtool: move checking code to check.c
  objtool, x86: add several functions and files to the objtool whitelist
  objtool: stack validation 2.0
  objtool: add undwarf debuginfo generation
  objtool, x86: add facility for asm code to provide unwind hints
  x86/entry: add unwind hint annotations
  x86/asm: add unwind hint annotations to sync_core()
  x86/unwind: add undwarf unwinder

 Documentation/x86/undwarf.txt                    |  146 +++
 arch/um/include/asm/unwind.h                     |    8 +
 arch/x86/Kconfig                                 |    1 +
 arch/x86/Kconfig.debug                           |   25 +
 arch/x86/crypto/Makefile                         |    2 +
 arch/x86/crypto/sha1-mb/Makefile                 |    2 +
 arch/x86/crypto/sha256-mb/Makefile               |    2 +
 arch/x86/entry/Makefile                          |    1 -
 arch/x86/entry/calling.h                         |    6 +
 arch/x86/entry/entry_64.S                        |   56 +-
 arch/x86/include/asm/module.h                    |    9 +
 arch/x86/include/asm/processor.h                 |    3 +
 arch/x86/include/asm/undwarf-types.h             |   99 ++
 arch/x86/include/asm/undwarf.h                   |  103 ++
 arch/x86/include/asm/unwind.h                    |   77 +-
 arch/x86/kernel/Makefile                         |    9 +-
 arch/x86/kernel/acpi/Makefile                    |    2 +
 arch/x86/kernel/kprobes/opt.c                    |    9 +-
 arch/x86/kernel/module.c                         |   12 +-
 arch/x86/kernel/reboot.c                         |    2 +
 arch/x86/kernel/setup.c                          |    3 +
 arch/x86/kernel/unwind_frame.c                   |   39 +-
 arch/x86/kernel/unwind_guess.c                   |    5 +
 arch/x86/kernel/unwind_undwarf.c                 |  589 ++++++++++
 arch/x86/kernel/vmlinux.lds.S                    |    2 +
 arch/x86/kvm/svm.c                               |    2 +
 arch/x86/kvm/vmx.c                               |    3 +
 arch/x86/lib/msr-reg.S                           |    8 +-
 arch/x86/net/Makefile                            |    2 +
 arch/x86/platform/efi/Makefile                   |    1 +
 arch/x86/power/Makefile                          |    2 +
 arch/x86/xen/Makefile                            |    3 +
 include/asm-generic/vmlinux.lds.h                |   20 +-
 kernel/kexec_core.c                              |    4 +-
 lib/Kconfig.debug                                |    3 +
 scripts/Makefile.build                           |   14 +-
 tools/objtool/Build                              |    4 +
 tools/objtool/Documentation/stack-validation.txt |  195 ++--
 tools/objtool/Makefile                           |    5 +-
 tools/objtool/arch.h                             |   64 +-
 tools/objtool/arch/x86/decode.c                  |  400 ++++++-
 tools/objtool/builtin-check.c                    | 1281 +---------------------
 tools/objtool/builtin-undwarf.c                  |   70 ++
 tools/objtool/builtin.h                          |    1 +
 tools/objtool/cfi.h                              |   55 +
 tools/objtool/{builtin-check.c => check.c}       |  954 ++++++++++++----
 tools/objtool/check.h                            |   79 ++
 tools/objtool/elf.c                              |  265 ++++-
 tools/objtool/elf.h                              |   21 +-
 tools/objtool/objtool.c                          |    3 +-
 tools/objtool/special.c                          |    6 +-
 tools/objtool/undwarf-types.h                    |   99 ++
 tools/objtool/{builtin.h => undwarf.h}           |   18 +-
 tools/objtool/undwarf_dump.c                     |  212 ++++
 tools/objtool/undwarf_gen.c                      |  215 ++++
 tools/objtool/warn.h                             |   10 +
 56 files changed, 3466 insertions(+), 1765 deletions(-)
 create mode 100644 Documentation/x86/undwarf.txt
 create mode 100644 arch/um/include/asm/unwind.h
 create mode 100644 arch/x86/include/asm/undwarf-types.h
 create mode 100644 arch/x86/include/asm/undwarf.h
 create mode 100644 arch/x86/kernel/unwind_undwarf.c
 create mode 100644 tools/objtool/builtin-undwarf.c
 create mode 100644 tools/objtool/cfi.h
 copy tools/objtool/{builtin-check.c => check.c} (59%)
 create mode 100644 tools/objtool/check.h
 create mode 100644 tools/objtool/undwarf-types.h
 copy tools/objtool/{builtin.h => undwarf.h} (67%)
 create mode 100644 tools/objtool/undwarf_dump.c
 create mode 100644 tools/objtool/undwarf_gen.c

-- 
2.7.5
