On Tue, Sep 1, 2020 at 12:56 PM Prarit Bhargava <prarit@xxxxxxxxxx> wrote: > > > > On 9/1/20 2:50 PM, Lucas De Marchi wrote: > > On Sat, Aug 29, 2020 at 4:15 AM Qu Wenruo <wqu@xxxxxxxx> wrote: > >> > >> When kernel module loading failed, user space only get one of the > >> following error messages: > >> - -ENOEXEC > >> This is the most confusing one. From corrupted ELF header to bad > >> WRITE|EXEC flags check introduced by in module_enforce_rwx_sections() > >> all returns this error number. > >> > >> - -EPERM > >> This is for blacklisted modules. But mod doesn't do extra explain > >> on this error either. > >> > >> - -ENOMEM > >> The only error which needs no explain. > >> > >> This means, if a user got "Exec format error" from modprobe, it provides > >> no meaningful way for the user to debug, and will take extra time > >> communicating to get extra info. > >> > >> So this patch will add extra error messages for -ENOEXEC and -EPERM > >> errors, allowing user to do better debugging and reporting. > >> > >> Signed-off-by: Qu Wenruo <wqu@xxxxxxxx> > >> --- > >> kernel/module.c | 11 +++++++++-- > >> 1 file changed, 9 insertions(+), 2 deletions(-) > >> > >> diff --git a/kernel/module.c b/kernel/module.c > >> index 1c5cff34d9f2..9f748c6eeb48 100644 > >> --- a/kernel/module.c > >> +++ b/kernel/module.c > >> @@ -2096,8 +2096,12 @@ static int module_enforce_rwx_sections(Elf_Ehdr *hdr, Elf_Shdr *sechdrs, > >> int i; > >> > >> for (i = 0; i < hdr->e_shnum; i++) { > >> - if ((sechdrs[i].sh_flags & shf_wx) == shf_wx) > >> + if ((sechdrs[i].sh_flags & shf_wx) == shf_wx) { > >> + pr_err( > >> + "Module %s section %d has invalid WRITE|EXEC flags\n", > >> + mod->name, i); > >> return -ENOEXEC; > >> + } > >> } > >> > >> return 0; > >> @@ -3825,8 +3829,10 @@ static int load_module(struct load_info *info, const char __user *uargs, > >> char *after_dashes; > >> > >> err = elf_header_check(info); > >> - if (err) > >> + if (err) { > >> + pr_err("Module has invalid ELF header\n"); > >> goto free_copy; > >> + } > >> > >> err = setup_load_info(info, flags); > >> if (err) > >> @@ -3834,6 +3840,7 @@ static int load_module(struct load_info *info, const char __user *uargs, > >> > >> if (blacklisted(info->name)) { > >> err = -EPERM; > >> + pr_err("Module %s is blacklisted\n", info->name); > > > > I wonder why would anyone actually add the blacklist to the command > > line like this and have no > > way to revert that back. This was introduced in > > Debug. Debug. Debug. ;) The parameter was added to debug broken installations > and kernel boots. > > > be7de5f91fdc modules: Add kernel parameter to blacklist modules > > as a way to overcome broken initrd generation afaics. > > Not the generation of the initramfs, but blocking a module loaded during the > boot. The installation may have failed and there's no easy way to then debug > what module was responsible for the failure. if you are using initrd, then *inside* the initrd you should have the /etc/modprobe.d/* file you want. I said "broken initrd generation" because the tool should put the file there, and apparently for you it isn't. Even if you don't have the file, you could use modprobe.blacklist= and let the blacklist happen in the userspace library rather than in the kernel. Module loading is not like firmware loading that happens without help from userspace. > > Either kernel > > command line (using modprobe.blacklist) > > or /etc/modprobe.d options are honoured by libkmod and allow a > > sufficiently privileged user to bypass it. > > > > Both of those options only work if the filesystem is mounted IIRC. It could be > the case that modprobe.blacklist now works earlier in the boot, however, I've > never used it because module_blacklist is the biggest and best hammer that I can > use to get through a broken installation or boot. > > In any case you're incorrectly assuming that the system has a filesystem on it. > That's not necessarily the case as I'll explain below. > > > +Rusty, +Prarit: is there anything this module parameter is covering > > that I'm missing? > > This is the situation I have repeatedly come across : A system at a remote site > will not boot any flavor of Linux. Since the system would not install the only > way to debug was to provide install images to workaround a module load failure. > As you can imagine that is a time consuming process as well as a bad end user > experience. > > I got so sick of it that I wrote the code above (well similar to it anyway -- > thanks for the review Randy ;)) to add the module_blacklist parameter to make > debugging an uninstallable "bricked" system easier. > > It's come in handy in some unexpected ways. We've had situations where modules > have loaded and corrupted memory and blacklisting them revealed that the modules > were responsible for the memory corruption. ok... but this seems a reimplementation of modprobe.blacklist= option in the kernel command line, but in kernel space, with no way to remove it after the kernel is booted. Lucas De Marchi > > P. > > > > > For the changes here, > > > > Reviewed-by: Lucas De Marchi <lucas.demarchi@xxxxxxxxx> > > > > thanks > > Lucas De Marchi > > > >> goto free_copy; > >> } > >> > >> -- > >> 2.27.0 > >> > > >