On Fri, Apr 01, 2016 at 09:35:34PM +0200, Jiri Kosina wrote:
> On Fri, 1 Apr 2016, Chris J Arges wrote:
>
> > Loading, please wait...
> > starting version 229
> > [    1.182869] random: udevadm urandom read with 2 bits of entropy available
> > [    1.241404] BUG: unable to handle kernel paging request at ffffffffc000f35f
>
> Gah, we surely can't change pages with RO PTE. Thanks for such a prompt
> testing. You do have CONFIG_DEBUG_SET_MODULE_RONX set, don't you?
>
> The patch below should fix that by marking the module RO (and relevant
> parts NX) only when it's guaranteed that .text is not going to be modified
> any more (and includes the error handling fix Miroslav spotted as well).
>
> Thanks.
>
>
>
> diff --git a/kernel/module.c b/kernel/module.c
> index 5f71aa6..430606d 100644
> --- a/kernel/module.c
> +++ b/kernel/module.c
> @@ -3211,7 +3211,7 @@ int __weak module_finalize(const Elf_Ehdr *hdr,
>  	return 0;
>  }
>
> -static int post_relocation(struct module *mod, const struct load_info *info)
> +static void post_relocation(struct module *mod, const struct load_info *info)
>  {
>  	/* Sort exception table now relocations are done. */
>  	sort_extable(mod->extable, mod->extable + mod->num_exentries);
> @@ -3222,9 +3222,6 @@ static int post_relocation(struct module *mod, const struct load_info *info)
>
>  	/* Setup kallsyms-specific fields. */
>  	add_kallsyms(mod, info);
> -
> -	/* Arch-specific module finalizing. */
> -	return module_finalize(info->hdr, info->sechdrs, mod);
>  }
>
>  /* Is this module of this name done loading?  No locks held. */
> @@ -3441,10 +3438,6 @@ static int complete_formation(struct module *mod, struct load_info *info)
>  	/* This relies on module_mutex for list integrity. */
>  	module_bug_finalize(info->hdr, info->sechdrs, mod);
>
> -	/* Set RO and NX regions */
> -	module_enable_ro(mod);
> -	module_enable_nx(mod);
> -
>  	/* Mark state as coming so strong_try_module_get() ignores us,
>  	 * but kallsyms etc. can see us. */
>  	mod->state = MODULE_STATE_COMING;
> @@ -3562,9 +3555,7 @@ static int load_module(struct load_info *info, const char __user *uargs,
>  	if (err < 0)
>  		goto free_modinfo;
>
> -	err = post_relocation(mod, info);
> -	if (err < 0)
> -		goto free_modinfo;
> +	post_relocation(mod, info);
>
>  	flush_module_icache(mod);
>
> @@ -3589,6 +3580,15 @@ static int load_module(struct load_info *info, const char __user *uargs,
>  	if (err)
>  		goto bug_cleanup;
>
> +	/* Arch-specific module finalizing. */
> +	err = module_finalize(info->hdr, info->sechdrs, mod);
> +	if (err)
> +		goto coming_cleanup;
> +
> +	/* Set RO and NX regions */
> +	module_enable_ro(mod);
> +	module_enable_nx(mod);
> +
>  	/* Module is ready to execute: parsing args may do that. */
>  	after_dashes = parse_args(mod->name, mod->args, mod->kp, mod->num_kp,
>  				  -32768, 32767, mod,

So I think this doesn't fix the problem. Dynamic relocations are
applied to the "patch module", whereas the above code deals with the
initialization order of the "patched module". This distinction
originally confused me as well, until Jessica set me straight.

Let me try to illustrate the problem with an example. Imagine you have
a patch module P which applies a patch to module M. P replaces M's
function F with a new function F', which uses paravirt ops.

1) Patch P is loaded before module M. P's new function F' has an
   instruction which is patched by apply_paravirt(), even though the
   patch hasn't been applied yet.

2) Module M is loaded. Before applying the patch, livepatch tries to
   apply a klp_reloc to the instruction in F' which was already patched
   by apply_paravirt() in step 1. This results in undefined behavior
   because it tries to patch the original instruction but instead
   patches the new paravirt instruction.

So the above patch makes no difference because the paravirt module
loading order doesn't really matter.

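To make steps 1 and 2 concrete, here is a toy userspace model in C. It
is only a sketch and not kernel code: the byte values, the relocation
offset, and all of the toy_* helpers are assumptions invented for
illustration. It only shows why a relocation that expects the
instruction bytes as built ends up clobbering bytes that something like
apply_paravirt() has already rewritten.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define SITE_LEN     7  /* length of the toy call site in bytes */
#define RELOC_OFFSET 3  /* where the toy relocation stores its 4 bytes */

/* Toy call site in F' as built: an indirect call whose 32-bit operand
 * is still zero and is meant to be filled in later by a relocation.
 * (Illustrative bytes only.) */
static const uint8_t site_as_built[SITE_LEN] = {
	0xff, 0x14, 0x25, 0x00, 0x00, 0x00, 0x00
};

/* Stand-in for apply_paravirt(): rewrite the whole site into a direct
 * call (one opcode byte plus a 32-bit displacement) padded with NOPs. */
static void toy_apply_paravirt(uint8_t *site, uint32_t rel32)
{
	site[0] = 0xe8;
	memcpy(&site[1], &rel32, sizeof(rel32));
	site[5] = 0x90;
	site[6] = 0x90;
}

/* Stand-in for applying a klp_reloc: blindly store a 32-bit value at a
 * fixed offset, assuming the bytes there still match site_as_built. */
static void toy_apply_reloc(uint8_t *site, uint32_t value)
{
	assert(RELOC_OFFSET + sizeof(value) <= SITE_LEN);
	memcpy(&site[RELOC_OFFSET], &value, sizeof(value));
}

/* Returns 1 if the site is still a well-formed toy direct call. */
static int toy_site_is_direct_call(const uint8_t *site, uint32_t rel32)
{
	uint32_t got;

	memcpy(&got, &site[1], sizeof(got));
	return site[0] == 0xe8 && got == rel32 &&
	       site[5] == 0x90 && site[6] == 0x90;
}

/* Replays steps 1 and 2 from the example above.
 * Returns 1 if the late relocation corrupted the rewritten site. */
int toy_replay_load_order(void)
{
	uint8_t site[SITE_LEN];

	memcpy(site, site_as_built, SITE_LEN);
	toy_apply_paravirt(site, 0x11223344u);	/* step 1: P is loaded */
	toy_apply_reloc(site, 0xaabbccddu);	/* step 2: M is loaded */

	return !toy_site_is_direct_call(site, 0x11223344u);
}
```

In this model toy_replay_load_order() reports corruption no matter when
the RO/NX protections would have been set, which is the point: the
conflict is between two writers to the same bytes in the patch module,
not a page permission ordering issue in the patched module.
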
Jessica proposed some novel fixes here:

  https://github.com/dynup/kpatch/issues/580#issuecomment-183001652

But I get the feeling that any fix would be quite ugly and brittle.

I think the *real* problem here (and one that we've seen before) is that
we have a feature which allows you to load a patch to a module before
loading the module itself. That really goes against the grain of how
module dependencies work. It has already given us several headaches and
it makes the livepatch code a lot more complex.

I really think we need to take another hard look at whether it's really
worth it. My current feeling is that it's not.

If we were able to get rid of that "feature", yes, the livepatch code
would be simpler, but there might be another awesome benefit: I suspect
we'd also be able to get rid of the need for specialized patch creation
tooling like kpatch-build. Instead I think we could just specify
klp_relocs info in the source code of the patch, and just use kbuild to
build the patch module. Not only would the livepatch code be simpler
(and much easier to wrap your head around), but the user space tooling
could be *vastly* simpler.

Of course, removing that feature might create some headaches for the
user. It is nice to be able to load a big cumulative patch without
having to load all the dependencies first. But maybe there are things
we could do to make the dependency problem more manageable, e.g.:
splitting up patch modules to be per-object? requiring the user to load
modules they don't need? patching or replacing the module on disk?
copying the new module to a new location and telling modprobe where to
find it?

I don't have all the answers but I think we should take a hard look at
some of these other approaches.

-- 
Josh