On 2/28/2024 05:54, Hans de Goede wrote:
Hi Daniel,

On 2/28/24 03:00, Daniel van Vugt wrote:
On 27/2/24 21:47, Hans de Goede wrote:

<snip>

I think some boot failures also take you to the grub menu automatically next time?

In Fedora all boot failures will unhide the grub menu on the next boot. This unfortunately relies on downstream changes so I don't know what Ubuntu does here.

<snip>

The kernel itself will be quiet as long as you set CONFIG_CONSOLE_LOGLEVEL_QUIET=3. Ubuntu atm has set this to 4, which means any kernel pr_err() or dev_err() messages will get through, and since there are quite a few false positives of those, Ubuntu really needs to set CONFIG_CONSOLE_LOGLEVEL_QUIET=3 to fix part of: https://bugs.launchpad.net/bugs/1970069

Incorrect. In my testing some laptops needed log level as low as 2 to go quiet. And the Ubuntu kernel team is never going to fix all those for non-sponsored devices.

Notice that atm Ubuntu's kernel is using the too high CONFIG_CONSOLE_LOGLEVEL_QUIET=4. With CONFIG_CONSOLE_LOGLEVEL_QUIET=3, getting any errors logged to the console should be very, very rare.

The only thing I can think of is if the kernel oopses / WARN()s early on but the cause is innocent enough that the boot happily continues. In that case actually showing the oops/WARN() is a good thing.

For all the years Fedora has had flickerfree boot I have seen zero bug reports about this.

If you have examples of this actually being a problem please file bugs for them (launchpad or bugzilla.kernel.org is fine) and then let's take a look at those bugs and fix them. These should be so rare that I'm not worried about this becoming a never-ending list of bugs (unlike pr_err() / dev_err() messages, of which there are simply too many).

I personally own many laptops with so many different boot messages that it's overwhelming for me already to report bugs for each of them. Hence this patch. Also I don't own all the laptops in the world, so fixing all the errors just for my collection wouldn't solve all cases. Whereas this patch does.

Almost all of those boot messages are because Ubuntu has set CONFIG_CONSOLE_LOGLEVEL_QUIET too high. Once that is fixed there should be very few, if any, messages left.

I too own many laptops and I'm not seeing this issue on any of them. You claim you are still seeing errors with CONFIG_CONSOLE_LOGLEVEL_QUIET=3, yet you have not provided a single example!

Sorry, but your real problem here seems to be your noisy downstream systemd patch. I'm not going to ack a kernel patch which I consider a bad idea because Ubuntu has a non-standard systemd patch which is too trigger-happy with spamming the console.

Indeed the systemd patch is a big problem. We seem to have had it for 9 years or so. I only just discovered it recently and would love to drop it, but was told we can't. Its main problem is that it uses the console as a communication pipe to plymouth. So simply making it less noisy isn't possible without disabling its functionality. It was seemingly intended to run behind the splash, but since it does fsck it tends to run before the splash (because DRM startup takes a few more seconds).

This comes back to what I was saying before - Ubuntu (and anyone else that wants a flicker free boot for that matter) should adopt simpledrm.
When simpledrm is compiled into the kernel, DRM will be up long before the splash screen comes up. When drivers do fastboot (Intel) or seamless (AMD) handoff, you /should/ be able to get the splash screen without a modeset.
I think doing that, together with changing the default log level so that err messages are not shown on the console, will go a long way.
If there is a concern that people need to see those, how about changing the kernel command line for the recovery kernel so that they only come up in the recovery kernel?
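
To make the log level numbers in this thread concrete, here is a rough userspace sketch of the comparison involved. It relies on the kernel printing a message to the console only when the message's level is numerically lower than the console log level, with KERN_CRIT at 2, KERN_ERR at 3 and KERN_WARNING at 4. The enum names and reaches_console() are made up for illustration and are not kernel symbols.

```c
#include <stdbool.h>

/*
 * Numeric values follow the kernel's KERN_* levels
 * (KERN_EMERG = 0 ... KERN_DEBUG = 7). The LVL_* names are
 * hypothetical and exist only for this sketch.
 */
enum sketch_level {
	LVL_EMERG,
	LVL_ALERT,
	LVL_CRIT,
	LVL_ERR,
	LVL_WARNING,
	LVL_NOTICE,
	LVL_INFO,
	LVL_DEBUG
};

/*
 * Hypothetical helper: a message is shown on the console only if its
 * level is numerically lower than the console log level in effect.
 */
static bool reaches_console(int msg_level, int console_loglevel)
{
	return msg_level < console_loglevel;
}
```

With a quiet level of 4, reaches_console(LVL_ERR, 4) is true, so pr_err() and dev_err() output gets through. With 3 it is false, and only crit and more severe messages remain. With 2, as Daniel reports needing on some laptops, even crit messages are suppressed.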
This does indeed sound like it is a non-trivial problem to fix, but that is still not a good reason to add this (IMHO) hack to the kernel.

The issue deferred fbcon takeover was designed to fix is that the fbcon would mess up the framebuffer contents even if nothing was ever logged to the console. The whole idea is to still have the fbcon come up as soon as there are any messages. Actively hiding messages was never part of the design, so this is still a NACK from me.

Also note that this matches how things work in grub and shim: when I first implemented flickerfree boot I also had to patch shim and grub to not make EFI text output protocol calls (including init()) until they actually had some text to show.

So the whole design here for shim, grub and the kernel has always been to not mess with the framebuffer until there is some text (any text) to output and then show that text immediately. I do not think that deviating from this design is a good idea.

Regards,

Hans

Closes: https://bugs.launchpad.net/bugs/1970069
Cc: Mario Limonciello <mario.limonciello@xxxxxxx>
Signed-off-by: Daniel van Vugt <daniel.van.vugt@xxxxxxxxxxxxx>
---
 drivers/video/fbdev/core/fbcon.c | 32 +++++++++++++++++++++++++++++---
 1 file changed, 29 insertions(+), 3 deletions(-)

diff --git a/drivers/video/fbdev/core/fbcon.c b/drivers/video/fbdev/core/fbcon.c
index 63af6ab034..5b9f7635f7 100644
--- a/drivers/video/fbdev/core/fbcon.c
+++ b/drivers/video/fbdev/core/fbcon.c
@@ -76,6 +76,7 @@
 #include <linux/crc32.h> /* For counting font checksums */
 #include <linux/uaccess.h>
 #include <asm/irq.h>
+#include <asm/cmdline.h>

 #include "fbcon.h"
 #include "fb_internal.h"
@@ -146,6 +147,7 @@ static inline void fbcon_map_override(void)

 #ifdef CONFIG_FRAMEBUFFER_CONSOLE_DEFERRED_TAKEOVER
 static bool deferred_takeover = true;
+static int initial_console = -1;
 #else
 #define deferred_takeover false
 #endif
@@ -3341,7 +3343,7 @@ static void fbcon_register_existing_fbs(struct work_struct *work)
 	console_unlock();
 }

-static struct notifier_block fbcon_output_nb;
+static struct notifier_block fbcon_output_nb, fbcon_switch_nb;
 static DECLARE_WORK(fbcon_deferred_takeover_work, fbcon_register_existing_fbs);

 static int fbcon_output_notifier(struct notifier_block *nb,
@@ -3358,6 +3360,21 @@ static int fbcon_output_notifier(struct notifier_block *nb,

 	return NOTIFY_OK;
 }
+
+static int fbcon_switch_notifier(struct notifier_block *nb,
+				 unsigned long action, void *data)
+{
+	struct vc_data *vc = data;
+
+	WARN_CONSOLE_UNLOCKED();
+
+	if (vc->vc_num != initial_console) {
+		dummycon_unregister_switch_notifier(&fbcon_switch_nb);
+		dummycon_register_output_notifier(&fbcon_output_nb);
+	}
+
+	return NOTIFY_OK;
+}
 #endif

 static void fbcon_start(void)
@@ -3370,7 +3387,14 @@ static void fbcon_start(void)

 	if (deferred_takeover) {
 		fbcon_output_nb.notifier_call = fbcon_output_notifier;
-		dummycon_register_output_notifier(&fbcon_output_nb);
+		fbcon_switch_nb.notifier_call = fbcon_switch_notifier;
+		initial_console = fg_console;
+
+		if (cmdline_find_option_bool(boot_command_line, "splash"))
+			dummycon_register_switch_notifier(&fbcon_switch_nb);
+		else
+			dummycon_register_output_notifier(&fbcon_output_nb);
+
 		return;
 	}
 #endif
@@ -3417,8 +3441,10 @@ void __exit fb_console_exit(void)
 {
 #ifdef CONFIG_FRAMEBUFFER_CONSOLE_DEFERRED_TAKEOVER
 	console_lock();
-	if (deferred_takeover)
+	if (deferred_takeover) {
 		dummycon_unregister_output_notifier(&fbcon_output_nb);
+		dummycon_unregister_switch_notifier(&fbcon_switch_nb);
+	}
 	console_unlock();

 	cancel_work_sync(&fbcon_deferred_takeover_work);
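
For anyone following the quoted patch, the notifier hand-off it proposes can be modelled in userspace roughly as below. This is only a sketch under assumptions: struct takeover_sim and the sim_*() functions are hypothetical stand-ins, not kernel API, and the model assumes that the existing output notifier triggers the takeover on the first console output, which is how the deferred takeover design is described in this thread.

```c
#include <stdbool.h>

/* Which notifier is currently armed (userspace stand-in, not kernel API). */
enum sim_armed { SIM_ARMED_NONE, SIM_ARMED_OUTPUT, SIM_ARMED_SWITCH };

/* Hypothetical model of the deferred takeover state in the quoted patch. */
struct takeover_sim {
	enum sim_armed armed;
	int initial_console;
	bool fbcon_took_over;
};

/*
 * Models fbcon_start() in the patch: with "splash" on the command line
 * only the switch notifier is armed, otherwise the output notifier is.
 */
static void sim_start(struct takeover_sim *s, bool splash, int fg_console)
{
	s->initial_console = fg_console;
	s->fbcon_took_over = false;
	s->armed = splash ? SIM_ARMED_SWITCH : SIM_ARMED_OUTPUT;
}

/*
 * Models text being written to the console: fbcon takes over only if
 * the output notifier is armed at that moment.
 */
static void sim_console_output(struct takeover_sim *s)
{
	if (s->armed == SIM_ARMED_OUTPUT) {
		s->armed = SIM_ARMED_NONE;
		s->fbcon_took_over = true;
	}
}

/*
 * Models fbcon_switch_notifier() in the patch: leaving the initial
 * console swaps the switch notifier for the output notifier.
 */
static void sim_vt_switch(struct takeover_sim *s, int new_vc)
{
	if (s->armed == SIM_ARMED_SWITCH && new_vc != s->initial_console)
		s->armed = SIM_ARMED_OUTPUT;
}
```

In this model, after sim_start() with splash set, any number of sim_console_output() calls leave fbcon_took_over false until sim_vt_switch() moves away from the initial console. That is the "actively hiding messages" behaviour Hans objects to above, as opposed to the existing design where the first output brings fbcon up.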