Re: [RFC PATCH v3 00/22] arm64: livepatch: Use ORC for dynamic frame pointer validation

Nick Desaulniers <ndesaulniers@xxxxxxxxxx> · Thu, 13 Apr 2023 10:04:05 -0700

On Thu, Mar 23, 2023 at 05:17:14PM +0000, Mark Rutland wrote:
> Hi Madhavan,
> 
> At a high-level, I think this still falls afoul of our desire to not reverse
> engineer control flow from the binary, and so I do not think this is the right
> approach. I've expanded a bit on that below.
> 
> I do think it would be nice to have *some* of the objtool changes, as I do
> think we will want to use objtool for some things in future (e.g. some
> build-time binary patching such as table sorting).
> 
> > Problem
> > =======
> > 
> > Objtool is complex and highly architecture-dependent. There are a lot of
> > different checks in objtool that all of the code in the kernel must pass
> > before livepatch can be enabled. If a check fails, it must be corrected
> > before we can proceed. Sometimes, the kernel code needs to be fixed.
> > Sometimes, it is a compiler bug that needs to be fixed. The challenge is
> > also to prove that all the work is complete for an architecture.
> > 
> > As such, it presents a great challenge to enable livepatch for an
> > architecture.
> 
> There's a more fundamental issue here in that objtool has to reverse-engineer
> control flow, and so even if the kernel code and compiled code generation is
> *perfect*, it's possible that objtool won't recognise the structure of the
> generated code, and won't be able to reverse-engineer the correct control flow.
> 
> We've seen issues where objtool didn't understand jump tables, so support for
> that got disabled on x86. A key objection from the arm64 side is that we don't
> want to disable compile code generation strategies like this. Further, as
> compiles evolve, their code generation strategies will change, and it's likely
> there will be other cases that crop up. This is inherently fragile.
> 
> The key objections from the arm64 side is that we don't want to
> reverse-engineer details from the binary, as this is complex, fragile, and
> unstable. This is why we've previously suggested that we should work with
> compiler folk to get what we need.

> This still requires reverse-engineering the forward-edge control flow in order
> to compute those offets, so the same objections apply with this approach. I do
> not think this is the right approach.
> 
> I would *strongly* prefer that we work with compiler folk to get the
> information that we need.

IDK if it's relevant here, but I did see a commit go by to LLVM that
seemed to include such info in a custom ELF section (for the purposes of
improving fuzzing, IIUC). Maybe such an encoding scheme could be tested
to see if it's reliable or usable?
- https://github.com/llvm/llvm-project/commit/3e52c0926c22575d918e7ca8369522b986635cd3
- https://clang.llvm.org/docs/SanitizerCoverage.html#tracing-control-flow

> 
> [...]
> 
> > 		FWIW, I have also compared the CFI I am generating with DWARF
> > 		information that the compiler generates. The CFIs match a
> > 		100% for Clang. In the case of gcc, the comparison fails
> > 		in 1.7% of the cases. I have analyzed those cases and found
> > 		the DWARF information generated by gcc is incorrect. The
> > 		ORC generated by my Objtool is correct.
> 
> 
> Have you reported this to the GCC folk, and can you give any examples?
> I'm sure they would be interested in fixing this, regardless of whether we end
> up using it.

Yeah, at least a bug report is good. "See something, say something."