Re: [PATCH printk v2 08/26] printk: nbcon: Implement processing in port->lock wrapper

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Mon 2024-03-11 18:14:19, John Ogness wrote:
> Hi Petr,
> 
> Thanks for the detailed feedback. Here is a lengthy response. I hope it
> clarifies the uart port and console fields. And I think you identified a
> bug relating to the setup() callback.
> 
> On 2024-02-23, Petr Mladek <pmladek@xxxxxxxx> wrote:
> > My main (only) concern was the synchronization of the various accessed
> > variables, especially, port->cons.
> 
> The only issue is if port->cons disappears between lock and unlock. I
> see there is code setting port->cons to NULL, although I do not know
> why. Once port->cons is set, there is never a reason to unset it.

I wonder if it might be needed for hotplugging of the device
or the driver. Well, I would expect that the structures won't exist
when the driver is not loaded and/or the device providing
the port does not exist.

But maybe, it is just a defensive programming style where unused
pointers are cleared.

> Regardless, I will add port->lock synchronization when modifying
> port->cons. There are only a few code sites and they are all during
> driver setup.
> 
> > Note: I am not completely sure how the early and valid console drivers
> >       share the same struct uart_port. So, maybe I miss some important
> >       guarantee.
> 
> The struct uart_port is _not_ shared between the early and normal
> consoles. However, the struct console is shared for normal consoles
> amoung various ports of a particular driver.

I see.

> > Anyway. synchronization of port->cons looks like a shade area.
> > IMHO, the existing code expects that it is used _only when the console
> > is registered_. But this patch wants to access it _even before
> > the console is registered_!
> 
> Indeed. It is not enough for uart_is_nbcon() to check if it is an
> NBCON. It must also check if it is registered, locklessly:
> 
>     hlist_unhashed_lockless(&con->node);
> 
> Most importantly to be sure that nbcon_init() has already been called.
> register_console() clears the nbcon state after cons->index has been
> set, but before the console has been added to the list.

Makes sense.

Well, it brings another question. Does this allow to have
the following situation?

CPU0				CPU1

  some_function()
    uart_port_lock()
      // locked just with up->lock
      // doing something with the port

				register_console()
				  // add struct console using the same
				  // port as CPU0
				  printk()
				    console_try_lock()
				    console_unlock()
				      console_flush_all()
					// acquire context for the newly
					// registered nbcon
					nbcon_context_try_acquire(ctxt)
					  con->write()

BANG: Both CPU0 and CPU1 are writing to the same port.

Reason: CPU0 locked only via port->lock.
	CPU1 locked only by acquiring nbcon context.

Maybe, this is not possible because the console is registered when
the struct uart_port is being initialized and nobody could
use the same port in parallel, except for the early console.
Where the early console is serialized using the console_lock().

Hmm, if the parallel use of struct port_lock is not possible
during register_console then we probably do not even need
to serialize setting and clearing of port->cons.

By other words, I wonder if printk() is the only nasty user of
the uart ports. By "nasty user" I mean:

  + Using the same port even without struct uart_port by
    early console.

  + Using the port via struct uart_port even before the device
    is completely initialized. I assume that register_console()
    is called from the driver init call.

For example, the device/port gets visible in sysfs. I wonder
if anyone could trigger an operation on the port which
it is being registered as a console.

Sigh, if we agree that the above race is possible then I can't think
of any elegant solution.

Well, it might be better to be on the safe side:

One solution would be to add nbcon consoles into the console_list
under uart_port_lock().

Another solution would be to make sure that any code serialized
by uart_port_lock() will be already synchronized by nbcon context
while the nbcon is added into the console_list. Maybe, we
could do this in con->setup() callback. Something like:

 * @need_nbcon_context: true when uart_port_lock() has to acquire
	nbcon context as well

struct uart_port {
	bool need_nbcon_context;

int uart_console_setup(struct console *con, char *options)
{
	struct uart_8250_port *up = &serial8250_ports[co->index];

	spin_lock_irq(&up->lock);
	up->need_nbcon_context=true;
	spin_unlock_irq(&up->lock);

	// do whatever the original uart_console_setup did
}

int uart_console_exit(struct console *con, char *options)
{
	struct uart_8250_port *up = &serial8250_ports[co->index];

	// do whatever the original uart_console_exit did

	spin_lock_irq(&up->lock);
	up->need_nbcon_context=false;
	spin_unlock_irq(&up->lock);
}

We might even use this variable in uart_nbcon_acquire() to
decide whether we need to acquire the context or not.


> > For example, it took me quite a lot of time to shake my head around:
> >
> > #define uart_console(port) \
> > 	((port)->cons && (port)->cons->index == (port)->line)
> >
> >   + port->cons and port->line are updated in the uart code.
> >     It seems that the update is not serialized by port->lock.
> >     Something might be done under port->mutex.
> >
> >   + cons->index is updated in register_console() under
> >     console_list_lock.
> >
> > Spoiler: I propose a solution which does not use uart_console().
> 
> This check is necessary because multiple ports of a driver will set and
> share the same port->cons value, even if they are not really the
> console. I looked into enforcing that port->cons is NULL if it is not a
> registered console, but this is tricky. port->cons is driver-internal
> and hidden from printk. The driver will set port->cons in case it
> becomes the console and printk will set cons->index once it has chosen
> which port will be the actual console. But there is no way to unset
> port->cons if a port was not chosen by printk.
> 
> The various fields have the following meaning (AFAICT):
> 
> port->line: An identifier to represent a particular port supported by a
> driver.
> 
> port->cons: The struct console to use if this port is chosen to be a
> console.
> 
> port->console: Boolean, true if this port was chosen to be a
> console. (Used only by the tty layer.)
> 
> cons->index: The port chosen by printk to be a console.
> 
> None of those fields specify if the port is currently registered as a
> console. For that you would need to check if port->cons->node is hashed
> and then verify that port->line matches port->cons->index. This is what
> uart_nbcon_acquire() needs to do (as well as check if it is an nbcon
> console).

This is a great description. It would be great to have it somewhere in
the sources. Maybe, above the locking/acquire functions.

> >> --- a/drivers/tty/serial/8250/8250_port.c
> >> +++ b/drivers/tty/serial/8250/8250_port.c
> >> @@ -3284,6 +3284,7 @@ void serial8250_init_port(struct uart_8250_port *up)
> >>  	struct uart_port *port = &up->port;
> >>  
> >>  	spin_lock_init(&port->lock);
> >> +	port->nbcon_locked_port = false;
> >
> > I am not sure if the variable really helps anything:
> >
> >    + nbcon_context release() must handle the case when it
> >      lost the ownership anyway.
> 
> uart_nbcon_release() must first check if the provided port is a
> registered nbcon console. A simple boolean check is certainly faster
> than the 4 checks mentioned above. After all, if it was never locked,
> there is no reason to unlock it.

Fair enough.

> > We just need to make sure that port->cons can be cleared only
> > under port->lock. But this would be required even with
> > port->nbcon_locked_port.
> 
> Agreed. I will add this.
> 
> >> --- a/kernel/printk/nbcon.c
> >> +++ b/kernel/printk/nbcon.c
> >> +void uart_nbcon_acquire(struct uart_port *up)
> >> +{
> >> +	struct console *con = up->cons;
> >> +	struct nbcon_context ctxt;
> >
> > I would add:
> >
> > 	lockdep_assert_held(&up->lock);
> 
> OK.
> 
> >> +
> >> +	if (!uart_is_nbcon(up))
> >> +		return;
> >> +
> >> +	WARN_ON_ONCE(up->nbcon_locked_port);
> >> +
> >> +	do {
> >> +		do {
> >> +			memset(&ctxt, 0, sizeof(ctxt));
> >> +			ctxt.console	= con;
> >> +			ctxt.prio	= NBCON_PRIO_NORMAL;
> >> +		} while (!nbcon_context_try_acquire(&ctxt));
> >> +
> >> +	} while (!nbcon_context_enter_unsafe(&ctxt));
> >> +
> >> +	up->nbcon_locked_port = true;
> >> +}
> >> +EXPORT_SYMBOL_GPL(uart_nbcon_acquire);
> >
> > I would prefer to split the uart and nbcon specific code, for example:
> 
> Can you explain why? This code will not be used anywhere else.

It would have helped me with the review. The function is short
but it touches internals from both uart_port and nbcon words:

  + Implements new variant of nbcon_ctxt_acquire() API (busy loop, no timeout)

  + Checks and modifies struct uart_port details which affect
    uart_port_lock() API.

IMHO, there is too much to think about in a single function ;-)


Best Regards,
Petr




[Index of Archives]     [Kernel Newbies]     [Security]     [Netfilter]     [Bugtraq]     [Linux PPP]     [Linux FS]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Samba]     [Video 4 Linux]     [Linmodem]     [Device Mapper]     [Linux Kernel for ARM]

  Powered by Linux