On 4/13/23 13:15, Jose E. Marchesi wrote:
>
>> On Thu, Mar 23, 2023 at 05:17:14PM +0000, Mark Rutland wrote:
>>> Hi Madhavan,
>>>
>>> At a high-level, I think this still falls afoul of our desire to not reverse
>>> engineer control flow from the binary, and so I do not think this is the right
>>> approach. I've expanded a bit on that below.
>>>
>>> I do think it would be nice to have *some* of the objtool changes, as I do
>>> think we will want to use objtool for some things in future (e.g. some
>>> build-time binary patching such as table sorting).
>>>
>>>> Problem
>>>> =======
>>>>
>>>> Objtool is complex and highly architecture-dependent. There are a lot of
>>>> different checks in objtool that all of the code in the kernel must pass
>>>> before livepatch can be enabled. If a check fails, it must be corrected
>>>> before we can proceed. Sometimes, the kernel code needs to be fixed.
>>>> Sometimes, it is a compiler bug that needs to be fixed. The challenge is
>>>> also to prove that all the work is complete for an architecture.
>>>>
>>>> As such, it presents a great challenge to enable livepatch for an
>>>> architecture.
>>>
>>> There's a more fundamental issue here in that objtool has to reverse-engineer
>>> control flow, and so even if the kernel code and compiled code generation is
>>> *perfect*, it's possible that objtool won't recognise the structure of the
>>> generated code, and won't be able to reverse-engineer the correct control flow.
>>>
>>> We've seen issues where objtool didn't understand jump tables, so support for
>>> that got disabled on x86. A key objection from the arm64 side is that we don't
>>> want to disable compiler code generation strategies like this. Further, as
>>> compilers evolve, their code generation strategies will change, and it's likely
>>> there will be other cases that crop up. This is inherently fragile.
>>>
>>> The key objection from the arm64 side is that we don't want to
>>> reverse-engineer details from the binary, as this is complex, fragile, and
>>> unstable. This is why we've previously suggested that we should work with
>>> compiler folk to get what we need.
>>
>>> This still requires reverse-engineering the forward-edge control flow in order
>>> to compute those offsets, so the same objections apply with this approach. I do
>>> not think this is the right approach.
>>>
>>> I would *strongly* prefer that we work with compiler folk to get the
>>> information that we need.
>>
>> IDK if it's relevant here, but I did see a commit go by to LLVM that
>> seemed to include such info in a custom ELF section (for the purposes of
>> improving fuzzing, IIUC). Maybe such an encoding scheme could be tested
>> to see if it's reliable or usable?
>> - https://github.com/llvm/llvm-project/commit/3e52c0926c22575d918e7ca8369522b986635cd3
>> - https://clang.llvm.org/docs/SanitizerCoverage.html#tracing-control-flow
>>
>>>
>>> [...]
>>>
>>>> FWIW, I have also compared the CFI I am generating with DWARF
>>>> information that the compiler generates. The CFIs match
>>>> 100% for Clang. In the case of gcc, the comparison fails
>>>> in 1.7% of the cases. I have analyzed those cases and found
>>>> the DWARF information generated by gcc is incorrect. The
>>>> ORC generated by my Objtool is correct.
>>>
>>>
>>> Have you reported this to the GCC folk, and can you give any examples?
>>> I'm sure they would be interested in fixing this, regardless of whether we end
>>> up using it.
>>
>> Yeah, at least a bug report is good. "See something, say something."
>
> By all means, please. If you guys report these issues on CFI
> divergences in the GCC bugzilla, we will look into fixing them.
>
> https://gcc.gnu.org/bugzilla

I will try to get the data again and report the problems that I see.

Thanks.

Madhavan