Hi Jeffrey, On Mon, Dec 16, 2019 at 11:49:52AM -0700, Jeffrey Hugo wrote: > On Wed, Dec 4, 2019 at 9:13 AM Leo Yan <leo.yan@xxxxxxxxxx> wrote: > > > > On Tue, Dec 03, 2019 at 03:42:31PM -0700, Jeffrey Hugo wrote: > > > > [...] > > > > > > > > This patch fixes the deadlock issue for recursive output; it adds a > > > > > > variable 'curr_user' to indicate the uart port is used by which CPU, if > > > > > > the CPU has acquired spinlock and wants to execute recursive output, > > > > > > it will directly bail out. Here we don't choose to avoid locking and > > > > > > print out log, the reason is in this case we don't want to reset the > > > > > > uart port with function msm_reset_dm_count(); otherwise it can introduce > > > > > > confliction with other flows and results in uart port malfunction and > > > > > > later cannot output anymore. > > > > > > > > > > Is this not fixable? Sure, fixing the deadlock is an improvement, but > > > > > dropping logs (particularly a memory warning like in your example) > > > > > seems undesirable. > > > > > > > > Thanks a lot for your reviewing, Jeffrey. > > > > > > > > Agreed with you for the concern. > > > > > > > > To be honest, I am not familiar with the msm uart driver, so have no > > > > confidence which is the best way for uart port operations. I can > > > > think out one possible fixing is shown in below, if detects the lock > > > > is not acquired then it will force to reset UART port before exit the > > > > function __msm_console_write(). > > > > > > > > This approach is not tested yet and it looks too arbitrary; I will > > > > give a try for it. At the meantime, welcome any insight suggestion > > > > with proper register operations. > > > > > > According to the documentation, NCF_TX is only needed for SW transmit > > > mode, where software is directly puttting characters in the fifo. Its > > > not needed for BAM mode. According to your example, recursive console > > > printing will only happen in BAM mode, and not in SW mode. Perhaps if > > > we put the NCF_TX uses to just the SW mode, we avoid the issue and can > > > allow recursive printing? > > > > Thanks for the suggestion! But based on the suggestion, I tried to > > change code as below, the console even cannot work when boot the > > kernel: > > > > static void msm_reset_dm_count(struct uart_port *port, int count) > > { > > + u32 val; > > + > > msm_wait_for_xmitr(port); > > - msm_write(port, count, UARTDM_NCF_TX); > > - msm_read(port, UARTDM_NCF_TX); > > + > > + val = msm_read(port, UARTDM_DMEN); > > + > > + /* > > + * NCF is only enabled for SW transmit mode and is > > + * skipped for BAM mode. > > + */ > > + if (!(val & UARTDM_DMEN_TX_BAM_ENABLE) && > > + !(val & UARTDM_DMEN_RX_BAM_ENABLE)) { > > + msm_write(port, count, UARTDM_NCF_TX); > > + msm_read(port, UARTDM_NCF_TX); > > + } > > } > > > > > > Alternatively, when exit from __msm_console_write() and if detect the > > case for without acquiring spinlock, invoke msm_wait_for_xmitr() to wait > > for transmit completion looks a good candidate solution. The updated > > patch is as below. Please let me know if this is doable? > > > > diff --git a/drivers/tty/serial/msm_serial.c b/drivers/tty/serial/msm_serial.c > > index 1db79ee8a886..aa6a494c898d 100644 > > --- a/drivers/tty/serial/msm_serial.c > > +++ b/drivers/tty/serial/msm_serial.c > > @@ -190,6 +190,7 @@ struct msm_port { > > bool break_detected; > > struct msm_dma tx_dma; > > struct msm_dma rx_dma; > > + struct cpumask curr_user; > > }; > > > > #define UART_TO_MSM(uart_port) container_of(uart_port, struct msm_port, uart) > > @@ -440,6 +441,7 @@ static void msm_complete_tx_dma(void *args) > > u32 val; > > > > spin_lock_irqsave(&port->lock, flags); > > + cpumask_set_cpu(smp_processor_id(), &msm_port->curr_user); > > > > /* Already stopped */ > > if (!dma->count) > > @@ -474,6 +476,7 @@ static void msm_complete_tx_dma(void *args) > > > > msm_handle_tx(port); > > done: > > + cpumask_clear_cpu(smp_processor_id(), &msm_port->curr_user); > > spin_unlock_irqrestore(&port->lock, flags); > > } > > > > @@ -548,6 +551,7 @@ static void msm_complete_rx_dma(void *args) > > u32 val; > > > > spin_lock_irqsave(&port->lock, flags); > > + cpumask_set_cpu(smp_processor_id(), &msm_port->curr_user); > > > > /* Already stopped */ > > if (!dma->count) > > @@ -594,6 +598,7 @@ static void msm_complete_rx_dma(void *args) > > > > msm_start_rx_dma(msm_port); > > done: > > + cpumask_clear_cpu(smp_processor_id(), &msm_port->curr_user); > > spin_unlock_irqrestore(&port->lock, flags); > > > > if (count) > > @@ -932,6 +937,7 @@ static irqreturn_t msm_uart_irq(int irq, void *dev_id) > > u32 val; > > > > spin_lock_irqsave(&port->lock, flags); > > + cpumask_set_cpu(smp_processor_id(), &msm_port->curr_user); > > misr = msm_read(port, UART_MISR); > > msm_write(port, 0, UART_IMR); /* disable interrupt */ > > > > @@ -963,6 +969,7 @@ static irqreturn_t msm_uart_irq(int irq, void *dev_id) > > msm_handle_delta_cts(port); > > > > msm_write(port, msm_port->imr, UART_IMR); /* restore interrupt */ > > + cpumask_clear_cpu(smp_processor_id(), &msm_port->curr_user); > > spin_unlock_irqrestore(&port->lock, flags); > > > > return IRQ_HANDLED; > > @@ -1573,10 +1580,12 @@ static inline struct uart_port *msm_get_port_from_line(unsigned int line) > > static void __msm_console_write(struct uart_port *port, const char *s, > > unsigned int count, bool is_uartdm) > > { > > + struct msm_port *msm_port = UART_TO_MSM(port); > > int i; > > int num_newlines = 0; > > bool replaced = false; > > void __iomem *tf; > > + int locked = 1; > > > > if (is_uartdm) > > tf = port->membase + UARTDM_TF; > > @@ -1589,7 +1598,15 @@ static void __msm_console_write(struct uart_port *port, const char *s, > > num_newlines++; > > count += num_newlines; > > > > - spin_lock(&port->lock); > > + if (port->sysrq) > > + locked = 0; > > + else if (oops_in_progress) > > + locked = spin_trylock(&port->lock); > > + else if (cpumask_test_cpu(smp_processor_id(), &msm_port->curr_user)) > > + locked = 0; > > + else > > + spin_lock(&port->lock); > > + > > if (is_uartdm) > > msm_reset_dm_count(port, count); > > > > @@ -1625,7 +1642,12 @@ static void __msm_console_write(struct uart_port *port, const char *s, > > iowrite32_rep(tf, buf, 1); > > i += num_chars; > > } > > - spin_unlock(&port->lock); > > + > > + if (!locked) > > + msm_wait_for_xmitr(port); > > Sorry, catching up from some travel. > > I don't understand this. At this point, haven't we already called > msm_reset_dm_count() and "corrupted" the state of the hardware? Yeah, at here msm_reset_dm_count() has been called. msm_wait_for_xmitr() is used to wait for completing transmition. So we can get flow as: msm_complete_tx_dma() kmalloc() fail __msm_console_write() msm_reset_dm_count() output logs msm_wait_for_xmitr() => ensure to not impact out flow My essential reason for adding msm_wait_for_xmitr() is to cleanup the "corrupted" state before return to out flow. Thanks, Leo Yan > > + > > + if (locked) > > + spin_unlock(&port->lock); > > }