I feel there is a general problem with the way infrastructure improvements are dealt with in the kernel – basically the submitter of new infrastructure is expected to convert all existing users, or is told "what we have now works." This is a classic way of building tech debt. Maybe this is a topic worth discussing?

On September 10, 2021 1:12:01 AM PDT, Rasmus Villemoes <linux@xxxxxxxxxxxxxxxxxx> wrote:
>On 08/09/2021 17.44, Luis Chamberlain wrote:
>> On Tue, Sep 07, 2021 at 08:00:03PM -0700, Andrew Morton wrote:
>>> From: Rasmus Villemoes <linux@xxxxxxxxxxxxxxxxxx>
>>>
>>> Any call of wait_for_initramfs() done before the unpacking has been scheduled (i.e. before rootfs_initcall time) must just return immediately [and let the caller find an empty file system] in order not to deadlock the machine. I mistakenly thought, and my limited testing confirmed, that there were no such calls, so I added a pr_warn_once() in wait_for_initramfs(). It turns out that one can indeed hit request_module() as well as kobject_uevent_env() during those early init calls, leading to a user-visible warning in the kernel log emitted consistently for certain configurations.
>>
>> Further proof that the semantics for init are still loose. Formalizing dependencies on init is something we should strive for, eventually with a DAG. The linker-tables work I did years ago strove to get us there, by letting us express a simple, explicit DAG through the linker. Unfortunately that patch set fell through because folks were more interested in questioning the alternative side benefits of linker tables, but the use case of helping with init is still valid.
>>
>> If we *do* want to resurrect this, folks should let me know.
>
>Heh, a while back I actually had some completely unrelated thing where I wanted to make use of the linker-tables infrastructure – I remembered reading about it on LWN, and was quite surprised when I learnt that that work had never made it in. I don't quite remember the use case (I think it was for some test module infrastructure). But if you do have time to resurrect those patches, I'd certainly be interested.
>
>> Since the kobject_uevent_env() interest here is for /sbin/hotplug and that crap is deprecated, in practice the relevant calls we'd care about are the request_module() calls.
>
>Yes - the first report I got about that pr_warn_once was indeed fixed by the reporter simply disabling CONFIG_UEVENT_HELPER
>(https://lore.kernel.org/lkml/9849be80-cfe5-b33e-8224-590a4c451415@xxxxxxxxx/).
>
>>> We could just remove the pr_warn_once(), but I think it's better to postpone enabling the usermodehelper framework until there is at least some chance of finding the executable. That is also a little more efficient in that a lot of work done in umh.c will be elided.
>>
>> I *don't* think we were aware that such request_module() calls were happening before the fs was even ready and failing silently with -ENOENT.
>
>Probably not, no, otherwise somebody would have noticed.
>
>> As such, although moving the usermodehelper_enable() to right after scheduling the populating of the rootfs is the right thing, we do lose the opportunity to learn who those odd callers were. We could choose not to care... but this is also a missed opportunity to find them. How important that is, is not clear to me, as this was silently failing before...
>>
>> If we wanted to keep a print for the above purpose though, we'd likely want the full stack trace to see who the hell made the call.
>
>Well, yes, I have myself fallen into that trap not just once, but at least twice. The first time was when I discovered this behaviour on one of the ppc targets I did this work for in the first place (before I came up with the CONFIG_MODPROBE_PATH patch). The second was when I asked a reporter to replace the pr_warn_once by WARN_ONCE:
>
>https://lore.kernel.org/lkml/4434f245-db3b-c02a-36c4-0111a0dfb78d@xxxxxxxxxxxxxxxxxx/
>
>The problem is that request_module() just fires off some worker thread, and then the original calling thread sits back and waits for that worker to return a result.
>
>>> However, it does change the error seen by those early callers from -ENOENT to -EBUSY, so there is a risk of a regression if any caller cares about the exact error value.
>>
>> I'd see this as a welcome evolution as it tells us more: we're saying "it's coming, try again" or whatever.
>
>Indeed, and I don't think it's the end of the world if somebody notices some change due to that, because we'd learn more about where those early request_module() calls come from.
>
>> A debug option to allow us to get a full warning trace in the -EBUSY case on early init would be nice to have.
>
>As noted above, that's difficult. We'd need a way to know which other task is waiting for us, then print the trace of that guy.
>
>I don't think anybody is gonna hear this tree falling, so let's not try to solve a problem before we know there is one.
>
>Rasmus

-- 
Sent from my Android device with K-9 Mail. Please excuse my brevity.
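
For readers who want to see the shape of what is being discussed, here is a minimal sketch of the early-return guard described in the patch text above. It is not the actual init/initramfs.c source; the names "unpack_scheduled" and "initramfs_domain" are illustrative stand-ins for "the async unpacking has been kicked off at rootfs_initcall time" and the async domain it runs in. A caller that arrives before that point gets a one-time warning and an empty rootfs instead of a deadlock; later callers simply wait for the unpacking to finish.

/*
 * Sketch only, not the real init/initramfs.c code.
 */
#include <linux/async.h>
#include <linux/printk.h>
#include <linux/types.h>

static ASYNC_DOMAIN_EXCLUSIVE(initramfs_domain);	/* illustrative */
static bool unpack_scheduled;				/* illustrative */

void wait_for_initramfs(void)
{
	if (!unpack_scheduled) {
		/*
		 * Called before rootfs_initcall time, i.e. before the
		 * unpacking has even been scheduled.  Blocking here
		 * would deadlock the machine, so warn once and let the
		 * caller find an empty rootfs.
		 */
		pr_warn_once("wait_for_initramfs() called before rootfs_initcalls\n");
		return;
	}

	/* Unpacking is in flight (or done); wait for it to complete. */
	async_synchronize_full_domain(&initramfs_domain);
}

The -ENOENT versus -EBUSY distinction mentioned above follows from this ordering: before the change, an early request_module() ran the helper against a still-empty rootfs and failed because the binary was not found; with usermodehelper_enable() postponed until the unpacking has been scheduled, the same early caller is rejected by the still-disabled usermodehelper framework, which reports -EBUSY instead.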