Re: [PATCH v2 6/9] kgdboc: Add earlycon_kgdboc to support early kgdb using boot consoles

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Tue, Apr 21, 2020 at 02:14:44PM -0700, Douglas Anderson wrote:
> We want to enable kgdb to debug the early parts of the kernel.
> Unfortunately kgdb normally is a client of the tty API in the kernel
> and serial drivers don't register to the tty layer until fairly late
> in the boot process.
> 
> Serial drivers do, however, commonly register a boot console.  Let's
> enable the kgdboc driver to work with boot consoles to provide early
> debugging.
> 
> This change co-opts the existing read() function pointer that's part
> of "struct console".  It's assumed that if a boot console (with the
> flag CON_BOOT) has implemented read() that both the read() and write()
> function are polling functions.  That means they work without
> interrupts and read() will return immediately (with 0 bytes read) if
> there's nothing to read.  This should be a safe assumption since it
> appears that no current boot consoles implement read() right now and
> there seems no reason to do so unless they wanted to support
> "earlycon_kgdboc".
> 
> The console API isn't really intended to have clients work with it
> like we're doing.  Specifically there doesn't appear to be any way for
> clients to be notified about a boot console being unregistered.  We'll
> work around this by checking that our console is still valid before
> using it.  We'll also try to transition off of the boot console and
> onto the "tty" API as quickly as possible.
> 
> The normal/expected way to make all this work is to use
> "earlycon_kgdboc" and "kgdboc" together.  You should point them both
> to the same physical serial connection.  At boot time, as the system
> transitions from the boot console to the normal console, kgdb will
> switch over.  If you don't use things in the normal/expected way it's
> a bit of a buyer-beware situation.  Things thought about:
> 
> - If you specify only "earlycon_kgdboc" but not "kgdboc" you still
>   might end up dropping into kgdb upon a crash/sysrq but you may not
>   be able to type.
> - If you use "keep_bootcon" (which is already a bit of a buyer-beware
>   option) and specify "earlycon_kgdboc" but not "kgdboc" we'll keep
>   trying to use your boot console for kgdb.
> - If your "earlycon_kgdboc" and "kgdboc" devices are not the same
>   device things should work OK, but it'll be your job to switch over
>   which device you're monitoring (including figuring out how to switch
>   over gdb in-flight if you're using it).
> 
> When trying to enable "earlycon_kgdboc" it should be noted that the
> names that are registered through the boot console layer and the tty
> layer are not the same for the same port.  For example when debugging
> on one board I'd need to pass "earlycon_kgdboc=qcom_geni
> kgdboc=ttyMSM0" to enable things properly.  Since digging up the boot
> console name is a pain and there will rarely be more than one boot
> console enabled, you can provide the "earlycon_kgdboc" parameter
> without specifying the name of the boot console.  In this case we'll
> just pick the first boot that implements read() that we find.
> 
> This new "earlycon_kgdboc" parameter should be contrasted to the
> existing "ekgdboc" parameter.  While both provide a way to debug very
> early, the usage and mechanisms are quite different.  Specifically
> "earlycon_kgdboc" is meant to be used in tandem with "kgdboc" and
> there is a transition from one to the other.  The "ekgdboc" parameter,
> on the other hand, replaces the "kgdboc" parameter.  It runs the same
> logic as the "kgdboc" parameter but just relies on your TTY driver
> being present super early.  The only known usage of the old "ekgdboc"
> parameter is documented as "ekgdboc=kbd earlyprintk=vga".  It should
> be noted that "kbd" has special treatment allowing it to init early as
> a tty device.
> 
> Signed-off-by: Douglas Anderson <dianders@xxxxxxxxxxxx>

Again, very happy with the overall approach, just a few quibbles.


> ---
> This patch touches files in several different subsystems, but it
> touches a single line and that line is related to kgdb.  I'm assuming
> this can all go through the kgdb tree, but if needed I can always
> introduce a new API call instead of modifying the old one and then
> just have the old API call be a thin wrapper on the new one.

Funny you should say that!

I don't really like that extra argument although it is nothing to do
with simplifying merges...

 
> diff --git a/drivers/tty/serial/kgdboc.c b/drivers/tty/serial/kgdboc.c
> index 519d8cfbfbed..2f526f2d2bea 100644
> --- a/drivers/tty/serial/kgdboc.c
> +++ b/drivers/tty/serial/kgdboc.c
> @@ -409,6 +465,80 @@ static int __init kgdboc_early_init(char *opt)
>  }
>  
>  early_param("ekgdboc", kgdboc_early_init);
> +
> +static int earlycon_kgdboc_get_char(void)
> +{
> +	char c;
> +
> +	if (earlycon_neutered || !earlycon->read(earlycon, &c, 1))
> +		return NO_POLL_CHAR;
> +
> +	return c;
> +}
> +
> +static void earlycon_kgdboc_put_char(u8 chr)
> +{
> +	if (!earlycon_neutered)
> +		earlycon->write(earlycon, &chr, 1);
> +}
> +
> +static void earlycon_kgdboc_pre_exp_handler(void)
> +{
> +	/*
> +	 * We don't get notified when the boot console is unregistered.
> +	 * Double-check when we enter the debugger.  Unfortunately we
> +	 * can't really unregister ourselves now, but at least don't crash.
> +	 */
> +	if (earlycon && !earlycon_neutered && !is_earlycon_still_valid()) {
> +		pr_warn("Neutering kgdb since boot console vanished\n");
> +		earlycon_neutered = true;

This is, IMHO, too subtle.

I don't think this is merely a warning with a gentle message about
neutering. IIUC the system is (or will shortly be) dead in the water.
After diligently stopping all the CPUs the debug-core will then start
waiting for a character that cannot possibly come!

I think this might be one of those vanishingly rare places where
panicing might actually the right thing to do... although only after 
neutering" the kgdb panic handler first ;-).


> +	}
> +}
> +
> +static struct kgdb_io earlycon_kgdboc_io_ops = {
> +	.name			= "earlycon_kgdboc",
> +	.read_char		= earlycon_kgdboc_get_char,
> +	.write_char		= earlycon_kgdboc_put_char,
> +	.pre_exception		= earlycon_kgdboc_pre_exp_handler,
> +	.is_console		= true,
> +};
> +
> +static int __init earlycon_kgdboc_init(char *opt)
> +{
> +	struct console *con;
> +
> +	kdb_init(KDB_INIT_EARLY);

This is normally taken care of by debug-core.c . Could this be
integrated into kgdb_register_io_module() ?


> +
> +	/*
> +	 * Look for a matching console, or if the name was left blank just
> +	 * pick the first one we find.
> +	 */
> +	console_lock();
> +	for_each_console(con) {
> +		if (con->write && con->read &&
> +		    (con->flags & (CON_BOOT | CON_ENABLED)) &&
> +		    (!opt || !opt[0] || strcmp(con->name, opt) == 0))
> +			break;
> +	}
> +	console_unlock();
> +
> +	if (!con) {
> +		pr_info("Couldn't find kgdb earlycon\n");
> +		return 0;
> +	}
> +
> +	earlycon = con;
> +	pr_info("Going to register kgdb with earlycon '%s'\n", con->name);
> +	if (kgdb_register_io_module(&earlycon_kgdboc_io_ops, false) != 0) {
> +		earlycon = NULL;
> +		pr_info("Failed to register kgdb with earlycon\n");
> +		return 0;
> +	}
> +
> +	return 0;
> +}
> +
> +early_param("earlycon_kgdboc", earlycon_kgdboc_init);
>  #endif /* CONFIG_KGDB_SERIAL_CONSOLE */
>  
>  module_init(init_kgdboc);
> diff --git a/kernel/debug/debug_core.c b/kernel/debug/debug_core.c
> index 8f178239856d..1b5435c6d92a 100644
> --- a/kernel/debug/debug_core.c
> +++ b/kernel/debug/debug_core.c
> @@ -1074,16 +1074,21 @@ EXPORT_SYMBOL_GPL(kgdb_schedule_breakpoint);
>  /**
>   *	kgdb_register_io_module - register KGDB IO module
>   *	@new_dbg_io_ops: the io ops vector
> + *	@replace: If true it's OK if there were old ops.  This is used
> + *		  to transition from early kgdb to normal kgdb.  It's
> + *		  assumed these are the same device so kgdb can continue.
>   *
>   *	Register it with the KGDB core.
>   */
> -int kgdb_register_io_module(struct kgdb_io *new_dbg_io_ops)
> +int kgdb_register_io_module(struct kgdb_io *new_dbg_io_ops, bool replace)

As I said I'm not a big fan of the extra argument. It makes the call
sites harder to read.

Could earlycon_kgdboc be registered with a boolean flag set so that
a subsequent register will automatically replace the old one
(maybe "is_replaceable" or "is_temporary")?

For bonus marks the core could also enforce that a replaceable io ops
table must have init set to null (because there is no deinit).


Daniel.



[Index of Archives]     [Kernel Newbies]     [Security]     [Netfilter]     [Bugtraq]     [Linux PPP]     [Linux FS]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Samba]     [Video 4 Linux]     [Linmodem]     [Device Mapper]     [Linux Kernel for ARM]

  Powered by Linux