On 9/1/20 4:17 PM, Lucas De Marchi wrote: > On Tue, Sep 1, 2020 at 12:56 PM Prarit Bhargava <prarit@xxxxxxxxxx> wrote: >> >> >> >> On 9/1/20 2:50 PM, Lucas De Marchi wrote: >>> On Sat, Aug 29, 2020 at 4:15 AM Qu Wenruo <wqu@xxxxxxxx> wrote: >>>> >>>> When kernel module loading failed, user space only get one of the >>>> following error messages: >>>> - -ENOEXEC >>>> This is the most confusing one. From corrupted ELF header to bad >>>> WRITE|EXEC flags check introduced by in module_enforce_rwx_sections() >>>> all returns this error number. >>>> >>>> - -EPERM >>>> This is for blacklisted modules. But mod doesn't do extra explain >>>> on this error either. >>>> >>>> - -ENOMEM >>>> The only error which needs no explain. >>>> >>>> This means, if a user got "Exec format error" from modprobe, it provides >>>> no meaningful way for the user to debug, and will take extra time >>>> communicating to get extra info. >>>> >>>> So this patch will add extra error messages for -ENOEXEC and -EPERM >>>> errors, allowing user to do better debugging and reporting. >>>> >>>> Signed-off-by: Qu Wenruo <wqu@xxxxxxxx> >>>> --- >>>> kernel/module.c | 11 +++++++++-- >>>> 1 file changed, 9 insertions(+), 2 deletions(-) >>>> >>>> diff --git a/kernel/module.c b/kernel/module.c >>>> index 1c5cff34d9f2..9f748c6eeb48 100644 >>>> --- a/kernel/module.c >>>> +++ b/kernel/module.c >>>> @@ -2096,8 +2096,12 @@ static int module_enforce_rwx_sections(Elf_Ehdr *hdr, Elf_Shdr *sechdrs, >>>> int i; >>>> >>>> for (i = 0; i < hdr->e_shnum; i++) { >>>> - if ((sechdrs[i].sh_flags & shf_wx) == shf_wx) >>>> + if ((sechdrs[i].sh_flags & shf_wx) == shf_wx) { >>>> + pr_err( >>>> + "Module %s section %d has invalid WRITE|EXEC flags\n", >>>> + mod->name, i); >>>> return -ENOEXEC; >>>> + } >>>> } >>>> >>>> return 0; >>>> @@ -3825,8 +3829,10 @@ static int load_module(struct load_info *info, const char __user *uargs, >>>> char *after_dashes; >>>> >>>> err = elf_header_check(info); >>>> - if (err) >>>> + if (err) { >>>> + pr_err("Module has invalid ELF header\n"); >>>> goto free_copy; >>>> + } >>>> >>>> err = setup_load_info(info, flags); >>>> if (err) >>>> @@ -3834,6 +3840,7 @@ static int load_module(struct load_info *info, const char __user *uargs, >>>> >>>> if (blacklisted(info->name)) { >>>> err = -EPERM; >>>> + pr_err("Module %s is blacklisted\n", info->name); >>> >>> I wonder why would anyone actually add the blacklist to the command >>> line like this and have no >>> way to revert that back. This was introduced in >> >> Debug. Debug. Debug. ;) The parameter was added to debug broken installations >> and kernel boots. >> >>> be7de5f91fdc modules: Add kernel parameter to blacklist modules >>> as a way to overcome broken initrd generation afaics. >> >> Not the generation of the initramfs, but blocking a module loaded during the >> boot. The installation may have failed and there's no easy way to then debug >> what module was responsible for the failure. > > if you are using initrd, then *inside* the initrd you should have the > /etc/modprobe.d/* file > you want. I said "broken initrd generation" because the tool should > put the file there, and > apparently for you it isn't. > > Even if you don't have the file, you could use modprobe.blacklist= and > let the blacklist happen > in the userspace library rather than in the kernel. Module loading is > not like firmware loading > that happens without help from userspace. > >> >> Either kernel >>> command line (using modprobe.blacklist) >>> or /etc/modprobe.d options are honoured by libkmod and allow a >>> sufficiently privileged user to bypass it. >>> >> >> Both of those options only work if the filesystem is mounted IIRC. It could be >> the case that modprobe.blacklist now works earlier in the boot, however, I've >> never used it because module_blacklist is the biggest and best hammer that I can >> use to get through a broken installation or boot. >> >> In any case you're incorrectly assuming that the system has a filesystem on it. >> That's not necessarily the case as I'll explain below. >> >>> +Rusty, +Prarit: is there anything this module parameter is covering >>> that I'm missing? >> >> This is the situation I have repeatedly come across : A system at a remote site >> will not boot any flavor of Linux. Since the system would not install the only >> way to debug was to provide install images to workaround a module load failure. >> As you can imagine that is a time consuming process as well as a bad end user >> experience. >> >> I got so sick of it that I wrote the code above (well similar to it anyway -- >> thanks for the review Randy ;)) to add the module_blacklist parameter to make >> debugging an uninstallable "bricked" system easier. >> >> It's come in handy in some unexpected ways. We've had situations where modules >> have loaded and corrupted memory and blacklisting them revealed that the modules >> were responsible for the memory corruption. > > ok... but this seems a reimplementation of modprobe.blacklist= option > in the kernel command line, > but in kernel space, with no way to remove it after the kernel is booted. > That's *EXACTLY* what I want. I do not want modprobe (because of some misconfiguration) to load the module I've explicitly blacklisted. P. > Lucas De Marchi > >> >> P. >> >>> >>> For the changes here, >>> >>> Reviewed-by: Lucas De Marchi <lucas.demarchi@xxxxxxxxx> >>> >>> thanks >>> Lucas De Marchi >>> >>>> goto free_copy; >>>> } >>>> >>>> -- >>>> 2.27.0 >>>> >>> >> >