On 08/03/2015 09:48 AM, David Jander wrote: > On Mon, 03 Aug 2015 08:06:03 -0400 > Peter Hurley <peter@xxxxxxxxxxxxxxxxxx> wrote: >> On 08/03/2015 02:20 AM, David Jander wrote: >>> On Fri, 31 Jul 2015 18:03:26 -0400 >>> Peter Hurley <peter@xxxxxxxxxxxxxxxxxx> wrote: >>>> On 07/28/2015 10:07 AM, David Jander wrote: >>>>> On Tue, 28 Jul 2015 08:23:13 -0400 >>>>> Peter Hurley <peter@xxxxxxxxxxxxxxxxxx> wrote: >>>>>> On 07/28/2015 02:07 AM, David Jander wrote: >>>>>>> On Mon, 27 Jul 2015 15:15:59 -0300 >>>>>>> Fabio Estevam <fabio.estevam@xxxxxxxxxxxxx> wrote: >>>>>>> >>>>>>>> This reverts commit e95044ba4fee93f5ea8a1a24b2d921e148503833. >>>>>>>> >>>>>>>> Commit e95044ba4fee93 ("tty: serial: imx.c: Reset UART before >>>>>>>> activating interrupts") terribly messes up with the console on mx6 >>>>>>>> boards, so let's revert it. >>>>>>> >>>>>>> IMHO Spewing out a few corrupt characters during startup (which >>>>>>> curiously also appear sometimes due to other reasons), is a lot less >>>>>>> of a problem that a hanging system! >>>>>> >>>>>> Seems you already know the symptoms. Was the console corruption a known >>>>>> issue before submitting? I didn't see anything in the changelog or >>>>>> discussion at submission time? >>>>> >>>>> Honestly I must admit that I have not cared enough about these symptoms >>>>> until now, although I remember regularly seeing a few garbled characters >>>>> coming from the UART during kernel boot, and I thought those might be >>>>> related to early console handover or something similar. I remember seeing >>>>> this since long before DAM support was added for i.MX6. >>>>> >>>>>>> Please don't revert this patch without providing a fix for the crash! >>>>>> >>>>>> This clearly isn't the correct fix, and it makes no sense to build a >>>>>> proper fix on top of a broken one. >>>>> >>>>> If I understand you correctly, you refer to resetting the UART at driver >>>>> init, while it is already transmitting due to something like >>>>> early-console? Is the UART driver really initialized twice for that? >>>>> Because otherwise I don't see how this could do any harm... >>>>> The driver clearly assumes the UART is coming straight out of reset, so >>>>> if it does not, it needs to be reset before the driver can safely do >>>>> anything. Please explain how this can in any way _not_ be the correct >>>>> solution (or at least part of it). >>>> >>>> There are many reasons not to do it this way: >>>> 1. You're disabling the early console. >>> >>> You might have a point here, but I have seen that imx_startup() is also >>> called later on during boot, >> >> ... when setserial or getty start and the tty is opened (which may not occur >> in other configurations) ... > > What I see when starting the kernel (console messages get printed), the moment > the UART which is also used as console is initialized, I see that only > "imx_set_termios()" gets called and then this message gets logged: > > [ 1.322839] console [ttymxc3] enabled > > This is the moment the kernel log buffer gets dumped, but imx_startup() is > never called at this time. imx_startup() only gets called once right before > mounting the rootfs and a few times after that (opening console from > user-space I assume). Right, that's how that works. A console can actually be initialized much earlier than driver probe with a console_initcall() declaration, whereas normal driver initialization takes place on first tty open. The UART driver startup() method is first called from tty_open() => uart_open() => uart_startup() => uart_port_startup(). > This means, that if we reset the uart in imx_startup(), we also need to do it > in imx_console_setup(). Yes. >>> which would also result in a hickup of the console >>> (which might explain the garbled characters I sometimes see during boot). >>> In a regular case, I'd say there should be some sort of hand-over mechanism >>> between the early-console driver and the real imx uart driver, >> >> There's no requirement that an early console must later become a console. > > That's a good point. > This means that no UART driver is allowed to mess with it's registers in the > probe function, right? Generally true. If possible, I would expect serial drivers to defer hardware enable until the hardware is actually required, either because the tty has been opened (startup()) or the console was registered (console setup()). >>> but since the >>> former is only meant as a low-level debug tool anyway... >> >> Early console is already used on production machines, for better or worse. > > I see. > >>>> 2. You're providing a crutch for a broken idea; if some other code in the >>>> driver is expecting the hardware to be initialized before the tty is >>>> opened (other than console which needs to self-sufficient), that code >>>> needs to blow up so we can find it and fix it. >>> >>> What you are saying, is exactly what happened. I discovered this problem >>> while debugging kexec crashes. This is the fix. >>> And btw, the console is actually what blew up. >> >> The changelog mentions nothing about console blowing up; the problem the >> changelog describes is that this driver enabled its irq handler prematurely, >> that's why it blew up, which I already explained how to fix. >> >> If something else is broken, please fix that in a separate patch. > > Ok, now I understand your point. > >>>> 3. It adds significant maintenance burden to have 1 driver do a >>>> fundamental operation differently than the other 100 drivers here. >>> >>> Hmmm... you pretend to say the other 100 drivers also won't work reliably >>> in a kexec scenario? Well, let's start hunting them down and fix them >>> too ;-) >> >> I think you'll find that doesn't happen with other drivers because they >> enable irq in startup(), not probe(), and the console is initialized to the >> minimum extent necessary to provide polled mode i/o. > > I see. > >>>>>> This time I would look at moving the request_irq() into the startup() >>>>>> method like every other serial driver does. >>>>> >>>>> This sounds indeed like a good solution to avoid the interrupt race, but >>>>> IMHO my patch is a better solution, because the driver still assumes the >>>>> UART comes straight out of reset. This is evident by the fact that the >>>>> driver only ever changes register settings that are not reset defaults. >>>>> This driver is clearly not meant to be used with an UART that has been >>>>> used before the driver initializes it for the first time. The only robust >>>>> fix for that is to either forcibly reset the UART at the beginning or to >>>>> fix the driver in that it makes no assumption about historic register >>>>> settings, which will lead to basically resetting it at startup anyway. >>>> >>>> Which is exactly what it already does: imx_startup() => imx_reset(). >>> >>> No. I'm afraid it doesn't. >>> imx_reset() alone does not actually reset the imx uart. It just resets the >>> fifo state-machine. >> >> Ok; seems that the function is mis-named then. > > Kind of... but the register bit in question is also called "software reset". > >>> Just resetting the state-machine is fine for imx_startup() >>> if you know you initialized the UART control registers to your knowledge >>> and liking before that, but it is not enough if the uart comes from a >>> totally unknown state (which happens in the kexec scenario or when using >>> some wild bootloader). >> >> Ok, but you're initializing it to a useless known state. >> >> The correct initial state for hardware is either >> 1. the initial termios that the driver declares when it registers, or >> 2. the line settings specified with the console. >> >> The first is done on first open, the second on console register. >> >>> If you are saying that resetting a device in the probe function is not the >>> right way to ensure it is in a known state, then please explain how it >>> should be done correctly and I'll gladly try again. >> >> If the request_irq() is moved into startup(), then it seems the only >> issue is the proper initialization of UCR1 and UCR2, for both startup() and >> console setup. > > Ok, now I see why. Thanks for clarifying. > >> What purpose does the RMW of UCR1 and UCR2 serve currently in imx_startup()? >> What bits is that trying to preserve? ISTM, imx_startup() should be >> initializing those registers. > > Both imx_startup() and imx_console_setup() I guess, right? Yeah, although I'm kind of wondering why completely reset the console? If there's data from the previous now-dead kernel stuck in the fifo, maybe you want to see that data pushed out? (just thinking out loud here) Maybe just init the console enough to guarantee UART mode, dma off, etc? >> Console initialization is performed in imx_console_setup(). What console >> options are using and how are you specifying them (command line or device >> tree)? > > Command-line: console=ttymxc3,115200 > > Thanks. I'll try again, avoiding to touch the UART in probe. > Does anyone here object to fully initializing the UART in imx_startup() and > imx_console_setup()? I have no objection to that (but see thinking-out-loud above). Regards, Peter Hurley -- To unsubscribe from this list: send the line "unsubscribe linux-serial" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html