Re: [PATCH v11 03/20] x86/stackvalidate: Compile-time stack validation

Josh Poimboeuf <jpoimboe@xxxxxxxxxx> · Thu, 27 Aug 2015 09:29:53 -0500

On Wed, Aug 26, 2015 at 04:26:28PM +0200, Andi Kleen wrote:
> > b) 100% reliable stack traces for DWARF enabled kernels
> > 
> >    This is not yet implemented.  See Documentation/stack-validation.txt
> >    for more details about what is planned.
> 
> The automatic CFI generation tool seems like a bad idea to me. There's not
> that much assembler code in Linux, and often when new assembler code is added
> it is something tricky. In this case you may end up spending more time
> fixing the tool than just fixing the assembler.
> 
> It would be also quite bad to require people who want to add some
> new assembler code to learn how to fix your tool to make their
> assembler work.

Really I don't see that being much of a problem.  The enforced rules
were probably too stringent in earlier versions of the patch set.  But I
relaxed the rules quite a bit and now they allow things like sibling
calls, jumps to outside the function, alternatives, etc.

Generally it's easy and straightforward to follow the rules and make the
tool happy.  If you're trying to do something too weird, then yes the
tool will complain, but IMO that's a good thing as it really makes you
consider whether the weirdness (and associated complexity) is justified.

I tried to document everything an asm coder would need to know.  Also I
have a vested interest in keeping the tool working and useful, and
I'm listed in the MAINTAINERS file.  So any frustrated people will know
who to yell at.

> It also wouldn't surprise me if there are some possible assembler tricks
> that are very hard/impossible to handle for a tool. For example how do you 
> have alternative() style patching? (that's a generic problem with
> your approach BTW)

Yeah, there are definitely a lot of these tricks, but the tool can
already handle them today: alternatives, jump labels, exception tables,
gcc switch jump tables.  It's usually a matter of parsing a special
section and then treating these alternative code paths as conditional
branches which need to be recursively followed and analyzed.

When such conditional branches converge, the tool ensures that every
incoming path has the same debug state at the convergence point.
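
To make the idea concrete, here is a rough sketch of that kind of
analysis.  It is a toy model and not the tool's actual code: the
opcodes, the analyze() function and the use of a single "stack depth"
number as the debug state are all assumptions made for illustration.
An alternative or jump label would be modeled here as just another
"jcc".

```python
# Toy instruction set (hypothetical, not real x86 decoding):
#   ("push", None)   stack depth grows by 8
#   ("pop",  None)   stack depth shrinks by 8
#   ("jcc",  n)      conditional branch: fall through, or jump to index n
#   ("jmp",  n)      unconditional jump to index n
#   ("nop",  None)   no effect on the stack
#   ("ret",  None)   return; stack depth must be back to 0

def analyze(insns):
    """Follow every code path starting at instruction 0.

    Returns a dict mapping instruction index -> stack depth on entry to
    that instruction.  Raises ValueError if two paths reach the same
    instruction with different depths, or if a return is reached with a
    non-empty frame.
    """
    state = {}

    def walk(i, depth):
        while True:
            if i >= len(insns):
                raise ValueError("fell off the end of the function")
            if i in state:
                # Convergence point: every incoming path must agree.
                if state[i] != depth:
                    raise ValueError(
                        f"insn {i}: depth {state[i]} vs {depth}")
                return
            state[i] = depth
            op, arg = insns[i]
            if op == "push":
                depth += 8
            elif op == "pop":
                depth -= 8
            elif op == "jcc":
                walk(arg, depth)      # follow the taken branch first
            elif op == "jmp":
                i = arg
                continue
            elif op == "ret":
                if depth != 0:
                    raise ValueError(f"insn {i}: return with depth {depth}")
                return
            i += 1                    # then continue with the fall-through

    walk(0, 0)
    return state


good = [("push", None), ("jcc", 3), ("nop", None), ("pop", None), ("ret", None)]
bad = [("jcc", 2), ("push", None), ("nop", None), ("ret", None)]
```

In "good", both paths reach instruction 3 with a depth of 8, so the
analysis succeeds.  In "bad", one path reaches instruction 2 with a
depth of 0 and the other with 8, so analyze() raises ValueError.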

> Doing some kind of CFI verifier would seem more feasible,  but it would
> need a black/white list to override it to handle the above cases.

CFI generation isn't any harder than validation.  Either way the tool
needs to do the code analysis which involves recursively following all
branches and associating a debug state with each instruction.  At the
end it either compares those debug states with existing CFI, or writes
them as new CFI.
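
As a rough illustration of why the two are so close: once the analysis
has produced a per-instruction state, the last step is small either
way.  The sketch below is hypothetical.  The "states" dict, the
(start index, depth) row format and both function names are
assumptions, and real CFI carries far more than one number.

```python
def generate(states):
    """Emit a compact table: one (start_index, depth) row per change."""
    rows = []
    for idx in sorted(states):
        if not rows or rows[-1][1] != states[idx]:
            rows.append((idx, states[idx]))
    return rows


def validate(states, rows):
    """Compare analyzed states with an existing table.

    `rows` must be sorted by start index.  Returns the sorted list of
    instruction indices where the table disagrees with the analysis.
    """
    mismatches = []
    for idx in sorted(states):
        expected = None
        for start, depth in rows:
            if start <= idx:
                expected = depth   # last row at or before idx applies
        if expected != states[idx]:
            mismatches.append(idx)
    return mismatches


# Analysis result for a five-instruction function (made-up numbers).
states = {0: 0, 1: 8, 2: 8, 3: 8, 4: 0}
```

Both functions consume the same "states" input, so the expensive part
(the path walk that produces it) is shared between validation and
generation.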

Also there are already some whitelist mechanisms for ignoring files,
functions, or instructions.

> BTW how do handle the increasing number of JITs in the kernel?

Yeah, compile-time CFI wouldn't be applicable for code which is
generated at runtime.  Maybe we will need a mechanism to allow eBPF to
quickly create minimal CFI-like metadata corresponding to the JIT code
it generates, which can be used by stack dumping code to identify the
JIT code and find the previous stack pointer on the stack.
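
One possible shape for such metadata, purely as a sketch: a registry of
address ranges, each recording where the previous stack pointer is
saved relative to the current one.  Nothing like this exists in the
patch set.  JitRegion, JitRegistry, prev_sp_offset and previous_sp()
are hypothetical names, and the stack is modeled as a plain dict.

```python
import bisect
from typing import NamedTuple


class JitRegion(NamedTuple):
    start: int            # first address of the generated code
    end: int              # one past the last address (exclusive)
    prev_sp_offset: int   # previous SP is saved at current SP + this offset


class JitRegistry:
    """Sorted list of JIT regions with O(log n) lookup by address."""

    def __init__(self):
        self._starts = []
        self._regions = []

    def register(self, region):
        i = bisect.bisect_left(self._starts, region.start)
        self._starts.insert(i, region.start)
        self._regions.insert(i, region)

    def lookup(self, ip):
        # Find the last region starting at or before ip, then check its end.
        i = bisect.bisect_right(self._starts, ip) - 1
        if i >= 0 and ip < self._regions[i].end:
            return self._regions[i]
        return None


def previous_sp(registry, ip, sp, stack):
    """Return the caller's SP if ip is in JIT code, else None.

    `stack` maps addresses to the values stored there.
    """
    region = registry.lookup(ip)
    if region is None:
        return None
    return stack[sp + region.prev_sp_offset]


registry = JitRegistry()
registry.register(JitRegion(0x2000, 0x2100, 16))
registry.register(JitRegion(0x1000, 0x1080, 8))
```

A stack dumper would call previous_sp() whenever it meets an address it
cannot resolve with compile-time CFI, and fall back to its normal
handling when the result is None.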

We can also add a debug NMI handler which validates stacks periodically
to ensure that stacks are always sane.

-- 
Josh