Hi Mark!

On Fri, 18 Oct 2019 18:41:02 +0100
Mark Rutland <mark.rutland@xxxxxxx> wrote:

> In the process of reworking this I spotted some issues that will get
> in the way of livepatching. Notably:
>
> * When modules can be loaded far away from the kernel, we'll
>   potentially need a PLT for each function within a module, if each can
>   be patched to a unique function. Currently we have a fixed number,
>   which is only sufficient for the two ftrace entry trampolines.
>
>   IIUC, the new code being patched in is itself a module, in which
>   case we'd need a PLT for each function in the main kernel image.

When no live patching is involved, obviously all cases need to have
been handled so far. And when a live patching module comes in, there
are calls in and out of the new patch code:

Calls going into the live patch are not aware of this. They are caught
by an active ftrace intercept, and the actual call into the LP module
is done in klp_arch_set_pc, by manipulating the intercept (call site)
return address (in case the thread lives in the "new world", for
completeness' sake). This is an unsigned long write in C.

All calls going _out_ from the KLP module are newly generated, as part
of the KLP module building process, and are thus known to be "extern"
-- a PLT entry should be generated and accounted for in the KLP module.

>   We have a few options here, e.g. changing which memory size model we
>   use, or reserving space for a PLT before each function using
>   -fpatchable-function-entry=N,M.

Nonetheless I'm happy I once added the ,M option here. You never know :)

> * There are windows where backtracing will miss the callsite's caller,
>   as its address is not live in the LR or existing chain of frame
>   records. Thus we cannot claim to have a reliable stacktrace.
>
>   I suspect we'll have to teach the stacktrace code to handle this as
>   a special-case.

Yes, that's where I had to step back. The unwinder needs to stop where
the chain is even questionable. In _all_ cases.
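
To illustrate the "stop whenever in doubt" rule, here is a rough
userspace sketch, not the arm64 unwinder. The struct layout, the name
walk_reliable() and the particular sanity checks are assumptions made
for illustration only; a real implementation has more cases to cover
(the ftrace entry windows among them).

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical AArch64-style frame record: saved FP, then saved LR. */
struct frame_record {
	uintptr_t fp;	/* address of the caller's record, 0 ends the chain */
	uintptr_t lr;	/* return address into the caller */
};

/*
 * Walk the frame-record chain starting at @fp inside the stack
 * [@low, @high). Up to @max return addresses go to @trace, the count
 * to @nr. Returns true only if the chain ends cleanly at a zero FP.
 * Anything questionable yields false, i.e. "not reliable".
 */
bool walk_reliable(uintptr_t fp, uintptr_t low, uintptr_t high,
		   uintptr_t *trace, size_t max, size_t *nr)
{
	*nr = 0;
	if (high < low || high - low < sizeof(struct frame_record))
		return false;

	while (fp) {
		const struct frame_record *rec;

		/* Record must be aligned and lie fully within the stack. */
		if (fp % sizeof(*rec) || fp < low || fp > high - sizeof(*rec))
			return false;
		rec = (const struct frame_record *)fp;

		/* A missing return address or a full buffer is "in doubt". */
		if (rec->lr == 0 || *nr == max)
			return false;
		trace[(*nr)++] = rec->lr;

		/* Stack grows down: a caller's record must be higher up. */
		if (rec->fp && rec->fp <= fp)
			return false;
		fp = rec->fp;
	}
	return true;
}
```

A caller would treat a false return as "wake this thread and let it
sort itself out", never as a partial but trustworthy trace.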
Missing only one race condition means a lurking inconsistency.

OTOH it's not a problem to report "not reliable" when in doubt; the
thread in question will then get woken up and unwind itself. It is only
an optimisation to let all kernel threads which are guaranteed to not
contain any patched functions sleep on.

> I'll try to write these up, as similar probably applies to other
> architectures with a link register.

I thought I'd quickly give you my feedback upfront here.

	Torsten
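
P.S. The redirection into the live patch mentioned above can be
sketched roughly as follows. This is a userspace stand-in, not kernel
code: the one-field struct pt_regs, the field name pc, and the
klp_intercept() helper with its in_new_world flag are assumptions for
illustration. Only the name klp_arch_set_pc and the "single unsigned
long write" come from the discussion itself.

```c
/* Hypothetical, trimmed-down stand-in for the kernel's struct pt_regs. */
struct pt_regs {
	unsigned long pc;	/* where the ftrace intercept returns to */
};

/* The redirection itself: a single unsigned long store. */
void klp_arch_set_pc(struct pt_regs *regs, unsigned long ip)
{
	regs->pc = ip;
}

/*
 * Hypothetical intercept handler: only a thread that already lives in
 * the "new world" is sent to the replacement function; everyone else
 * returns to the original code untouched.
 */
void klp_intercept(struct pt_regs *regs, int in_new_world,
		   unsigned long new_func)
{
	if (in_new_world)
		klp_arch_set_pc(regs, new_func);
}
```

The caller never sees any of this, which is why calls _into_ the patch
need no PLT of their own on the calling side.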