On Mon, Jun 24, 2024 at 02:58:39PM -0700, Doug Anderson wrote: > On Mon, Jun 24, 2024 at 2:23 PM Doug Anderson <dianders@xxxxxxxxxxxx> wrote: > > On Mon, Jun 24, 2024 at 6:31 AM Johan Hovold <johan+linaro@xxxxxxxxxx> wrote: > > > +static void qcom_geni_serial_clear_tx_fifo(struct uart_port *uport) > > > +{ > > > + struct qcom_geni_serial_port *port = to_dev_port(uport); > > > + > > > if (!qcom_geni_serial_main_active(uport)) > > > return; > > > > > > + /* > > > + * Increase watermark level so that TX can be restarted and wait for > > > + * sequencer to start to prevent lockups. > > > + */ > > > + writel(port->tx_fifo_depth, uport->membase + SE_GENI_TX_WATERMARK_REG); > > > + qcom_geni_serial_poll_bit(uport, SE_GENI_M_IRQ_STATUS, > > > + M_TX_FIFO_WATERMARK_EN, true); > > > > Oh, maybe this "wait for sequencer to start to prevent lockups." is > > the part that I was missing? Can you explain more about what's going > > on here? Why does waiting for the watermark interrupt to fire prevent > > lockups? I would have imagined that the watermark interrupt would be > > part of the geni hardware and have nothing to do with the firmware > > running on the other end, so I'm not sure why it firing somehow would > > prevent a lockup. Was this just by trial and error? > > Actually, the more I look at it the more confused I am about your > qcom_geni_serial_clear_tx_fifo(). Can you explain and maybe add some > inline comments in the function since it's not obvious? Specifically, > things I'm confused about with your patch: > > 1. The function is named qcom_geni_serial_clear_tx_fifo() which > implies that when it finishes that the hardware FIFO will have nothing > in it. ...but how does your code ensure this? Yeah, I realised after I sent out the series that this may not be the case. I was under the impression that cancelling a command would discard the data in the FIFO (e.g. when starting the next command) but that was probably an error in my mental model. Do you see any way to discard the FIFO in the docs you have access to? > 2. If the function is really clearing the FIFOs then why do we need to > adjust the watermark level? The fact that you need to adjust the > watermark levels implies (to me) that there are things stuck in the > FIFO still. ...but then what happens to those characters? When are > they sent? Exactly, there is data there according to the FIFO status, but I erroneously interpreted it as a it would be discarded (e.g. when starting the next command). > 3. On my hardware you're setting the FIFO level to 16 here. The docs I > have say that if the FIFO level is "less than" the value you set here > then the interrupt will go off and further clarifies that if you set > the register to 1 here then you'll get interrupted when the FIFO is > empty. So what happens with your solution if the FIFO is completely > full? In that case you'd have to set this to 17, right? ...but then I > could believe that might confuse the interrupt handler which would get > told to start transmitting when there is no room for anything. Indeed. I may implicitly be relying on the absence of hardware flow control as well so that waiting for one character to be sent is what makes this work. > Maybe something is missing in my mental model here and testing your > patch and hitting Ctrl-C seems to work, but I don't really understand > why so hopefully you can clarify! :-) I spent too much time trying to reverse engineer this hw over the weekend and wanted to get this out as a counter proposal as it indicated that we may be able to find a smaller fix. The series addresses the serial getty issues, but as I mentioned yesterday there is some interaction with the console left to be resolved before this can be an alternative. If it wasn't for the hard lockup I would have sent the whole thing as an RFC. Johan