On Wed, Mar 09, 2016 at 03:01:46PM +0100, Petr Mladek wrote: > LivePatch framework deserves some documentation, definitely. > This is an attempt to provide some basic info. I hope that > it will be useful for both LivePatch producers and also > potential developers of the framework itself. > > Signed-off-by: Petr Mladek <pmladek@xxxxxxxx> Hi Petr, This is awesome. Thank you for starting this! Some minor nits below. > This patch was motivated by the LivePatch port for PPC. The guys > might want to document some PPC-specific limitations on top of it. > > I am sure that it is far from perfect. But I hope that it is > an acceptable start that can be improved later. I hope that > I did not write that many factual mistakes. > > I wrote only some generic info about the consistency model. > I am not sure if we have agreed on some specification yet. > > I am sorry for grammar mistakes. I hope that some hairs will > stay on your head if you are sensitive. > > Documentation/livepatch/livepatch.txt | 277 ++++++++++++++++++++++++++++++++++ > MAINTAINERS | 1 + > 2 files changed, 278 insertions(+) > create mode 100644 Documentation/livepatch/livepatch.txt > > diff --git a/Documentation/livepatch/livepatch.txt b/Documentation/livepatch/livepatch.txt > new file mode 100644 > index 000000000000..28e8047abb61 > --- /dev/null > +++ b/Documentation/livepatch/livepatch.txt > @@ -0,0 +1,277 @@ > +========= > +LivePatch > +========= At the risk of sounding like a broken record, I have to say that I really don't like the CamelCase in LivePatch. I don't have a good reason other than it gives me flashbacks to being exposed to EnterpriseApplicationObjectOrientedProgramming. And there's no existing precedent... # git grep LivePatch # ...so can we please not create one now? :-) </end grammar pedantry> > + > +This document outlines basic information about kernel LivePatching. > + > +Table of Contents: > + > +1. Motivation > +2. Kprobes, Ftrace, LivePatching > +3. Consistency model > +4. LivePatch life-cycle > + 4.1. Registration > + 4.2. Enabling > + 4.3. Disabling > + 4.4. Unregistration > +5. Livepatch module > + 5.1. New functions > + 5.2. Metadata > + 5.3. Module handling > +6. Sysfs > +7. Limitations > + > + > +1. Motivation > +============= > + > +There are situations when people are really reluctant to reboot a system. > +It might be because the computer is in the middle of a complex scientific > +computation. Or the system is busy handling customer requests in the high > +season. > + > +On the other hand, people also want to keep the system stable and secure. > +This is where LivePatch infrastructure comes handy. It allows to redirect > +selected function calls to a fixed implementation without rebooting > +the system. > + > + > +2. Kprobes, Ftrace, LivePatching > +================================ > + > +Linux kernel has more ways how to redirect an existing code into a new one. > +It happens with kernel probes, function tracing, and LivePatching: > + > + + The kernel probes are the most generic way. The code can be redirected > + by putting an interrupt instruction instead of any instruction. > + > + + The function tracer calls the code from a predefined location that is > + close the function entry. The location is generated by the compiler, > + see -pg gcc option. > + > + + LivePatching typically needs to redirect the code at the very beginning > + of the function entry before the function parameters or the stack > + are anyhow muffled. > + > +All three approaches need to modify the existing code at runtime. Therefore > +they need to be aware of each other and do not step over othres' toes. Most > +of these problems are solved by using the dynamic ftrace framework as a base. > +A Kprobe is registered as a ftrace handler when the function entry is probed, > +see CONFIG_KPROBES_ON_FTRACE. Also an alternative function from a live patch > +is called from a custom ftrace handler. But there are some limitations, > +see below. > + > + > +3. Consistency model > +==================== > + > +Functions are there for a reason. They take some input parameters, get or > +release locks, read, process, and even write some data in a defined way, > +have return values. By other words, each function has a defined semantic. > + > +Many fixes do not change the semantic of the modified functions. For example, > +they add a NULL pointer or a boundary check, fix a race by adding a missing > +memory barrier, or add some locking about a critical section. Most of these > +changes are self contained and the function present itself the same way > +to the rest of the system. In this case, the functions might be updated > +independently one by one. > + > +But there are more complex fixes. For example, a patch might change > +ordering of locking in more functions at the same time. Or a patch > +might exchange meaning of some temporary structures and update > +all the relevant functions. In this case, the affected unit > +(thread, whole kernel) need to start using all new versions of > +the functions at the same time. Also the switch must happen only > +when it is safe to do so, e.g. when the affected locks are released, > +the data using the modified structures are empty. I can't grok "the data using the modified structures are empty". Can you clarify? > +The theory about how to apply functions a safe way is rather complex. > +The aim is to define a so-called consistency model. It means to define > +conditions when the new implementation could be used so that the system > +stays consistent. The theory is not yet finished. See the discussion at > +http://thread.gmane.org/gmane.linux.kernel/1823033/focus=1828189 > + > +The current implementation supports the easiest scenario that > +is sufficeint for the most common fixes. See the limitations below. > + > + > +4. LivePatch life-cycle > +======================= > + > +LivePatching defines four basic operations that define the life cycle > +of each live patch. > + > +4.1. Registration > +----------------- > + > +Each patch has to be registered using klp_register_patch(). > + > +Here the patch is added into the list of known patches. The addresses > +of the patched functions are found according to their names. > +Relocations are applied. The relevant entries are created under > +/sys/kernel/livepatch/<name>. > + > + > +4.2. Enabling > +------------- > + > +Registered patches might be enabled either by calling klp_enable_patch() or > +by writing '1' to /sys/kernel/livepatch/<name>/enabled. > + > +At this stage, an universal ftrace handler is registered for all newly patched > +functions with a function-specific ftrace_ops structure. The structure points > +to a list of struct klp_func, see func_stack. This way the same function > +can be patched more times. The last variant from the func_stack is used. > + > +Note that we could enable patches in a different order than they are > +registered. The actually used function is defined by the order in > +the func_stack list. > + > + > +4.3. Disabling > +-------------- > + > +Enabled patches might get disabled either by calling klp_disable_patch() or > +by writing '0' to /sys/kernel/livepatch/<name>/enabled. > + > +Here all the struct klp_functions are removed from the appropriate > +ftrace_ops. The ftrace handler is unregistered when the func_stack > +list gets empty. > + > +Patches must be disabled in the exactly revese order in which they were > +enabled. It makes the problem and the implementation easier. > + > + > +4.4. Unregistration > +------------------- > + > +Disabled patched might be unregistered by calling klp_unregister_patch(). > + > +At this stage, all the relevant sys-fs entries are removed and the patch > +is removed from the list of known patches. > + > + > +5. Livepatch module > +=================== > + > +Live patches are distributed using kernel modules, see > +samples/livepatch/livepatch-sample.c. > + > +The module includes a new implementation of functions that we want > +to replace. In addition, it defines some structures describing what > +functions are replaced. Finally, there is a code for registering, > +enabling, and unregistering the patch. > + > + > +5.1. New functions > +------------------ > + > +New versions of functions are typically just copied from the fixed sources. > +A good practice is to add a prefix to the names so that they can be > +distinguished from the original ones, e.g. in a backtrace. Also it > +is usually enough to have a local visibility (static). > + > +The patch contains only functions that are really modified. But they might > +want to access functions or data with local visibility from the original > +source.c file. This can be solved by relocation information. FIXME: > +The support and documentation for relocations is still in progress. > + > + > +5.2. Metadata > +------------ > + > +The patch is described by several structures that split the information > +into three levels: > + > + + struct klp_patch is defined for each patched function. It includes s/klp_patch/klp_func/ > + a name (string) of the original function, optionaly the position > + of the symbol within an object, and a name (pointer) to the new > + function implementation. The old function will be later found via > + kallsyms at runtime. The new function is defined in the same > + source file. > + > + + struct klp_object defines an array of patched functions (struct > + klp_patch) in the same object. Where object is either vmlinux (NULL) s/klp_patch/klp_func/ > + or a module name. It helps to group and handle functions for each > + object together. Note that patched modules might be loaded later > + then the patch itself and the relevant functions might be patched > + only when they are available. > + > + + struct klp_patch defines an array of patched objects (struct > + klp_object). It allows to handle all patched functions consistently > + and synchronously. The whole patch is applied only when all available > + symbols can be patched. This last sentence confuses me. What does "all available symbols" mean? Does it imply a more complex consistency model? > If a more complex consistency model is supported > + then a selected unit (thread, kernel as a whole) will see the new code > + from the entire patch only when they are in a safe state. > + > + > +5.3. Module handling > +------------------- Similar to Jessica's comment, maybe change this header to "Livepatch module handling" to distinguish livepatch modules from patched modules. > + > +The live patch is typically registered and enabled when the module > +is loaded. The reverse operations are called when the module > +is being removed. > + > +IMPORTANT: Livepatch modules could not be removed at the moment. > +See the limitations below. > + > + > +6. Sysfs > +======== > + > +Information about the registered patches might be found under > +/sys/kernel/livepatch. The patches could be enabled and disabled > +by writing there. > + > +See Documentation/ABI/testing/sysfs-kernel-livepatch for more details. > + > + > +7. Limitations > +============== > + > +The initial Livepatch implementation has several limitations: > + > + + The modules with LivePatches could not be removed without forcing > + at the moment. > + > + The problem is how to detect if anyone is still using (sleeping inside) > + a code from the patch. It will get most likely solved once a more complex > + consistency model is supported. The idea is that a safe state for patching > + should also mean a safe state for removing the patch. > + > + > + + Only functions that can be traced could be patched. > + > + Livepatch is based on the dynamic ftrace. In particular, functions > + implementing ftrace or the livepatch ftrace handler could not be patched. > + Otherwise, you would end up in an infinite loop. A potential mistake > + is prevented by marking the problematic functions by "notrace". > + > + > + + Livepatch works reliably only when the dynamic ftrace is located at > + the very beginning of the function. > + > + The function need to be redirected before the stack or the function > + parameters are muffled any way. For example, LivePatch requires > + using -fentry on x86_64. > + > + > + + The patch must not change the semantic of the patched functions. > + > + The current implementation guarantees only that either the old > + or the new function is called. The functions are patched one > + by one. It means that the patch must _not_ change the semantic > + of the function. > + > + > + + Kretprobes using the ftrace framework conflict with the patched functions. > + > + Both Kretprobes and LivePatches use a ftrace handler that modifies > + the return address. The first user wins. Either the probe or the patch > + is rejected when the handler is already in use by the other. > + > + > + + Kprobes in the original function are ignored when the code is redirected > + to the new implementation. > + > + There is a work in progress to add warnings about this situations. > diff --git a/MAINTAINERS b/MAINTAINERS > index 4029c63d8a7d..0e7049688862 100644 > --- a/MAINTAINERS > +++ b/MAINTAINERS > @@ -6590,6 +6590,7 @@ F: kernel/livepatch/ > F: include/linux/livepatch.h > F: arch/x86/include/asm/livepatch.h > F: arch/x86/kernel/livepatch.c > +F: Documentation/livepatch/ > F: Documentation/ABI/testing/sysfs-kernel-livepatch > F: samples/livepatch/ > L: live-patching@xxxxxxxxxxxxxxx > -- > 1.8.5.6 > -- Josh -- To unsubscribe from this list: send the line "unsubscribe live-patching" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html