Re: [PATCH] serial: Fail TIOCMGET/SET if port is suspended

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 11/23/2015 12:46 PM, Gabriel Krisman Bertazi wrote:
> Peter Hurley <peter@xxxxxxxxxxxxxxxxxx> writes:
> 
>> Hi Gabriel,
>>
>> On 11/11/2015 09:18 AM, Peter Hurley wrote:
>>> On 11/11/2015 08:15 AM, Gabriel Krisman Bertazi wrote:
>>>> When a PCI error is detected, reset code in 8250_pci.c suspends the uart
>>>> ports to block IO while recovery procedure is performed.  Nevertheless,
>>>> TIOCMGET and TIOCMSET ioctls are still able to reach the device even
>>>> when it is suspended, which might cause EEH recovery of serial adapters
>>>> to fail on Power systems.  This patch blocks any TIOCMGET/TIOCMSET calls
>>>> from reaching the device if the port is suspended, to avoid racing with
>>>> the recovery procedure.
>>>
>>> This really should be handled by the 8250_pci sub-driver ticking
>>> TTY_IO_ERROR so that almost all tty operations fail (a minor bug fix
>>> to properly order the clearing of ASYNC_INITIALIZED w/ setting TTY_IO_ERROR
>>> in uart_shutdown() would be req'd).
>>>
>>> Unfortunately the machinery to do that doesn't exist.
>>> I could trivially add a means given a uart_port ptr but the difficulty
>>> of _safely_ getting a uart_port from line index still remains;
>>> serial8250_get_port() should have just added a full ref count mechanism.
>>
>> Re-checking the 8250_pci sub-driver, I see it goes right through
>> serial8250_suspend_port() which assumes the 8250 port will not be unregistered
>> concurrently, so I assume the PCI subsystem serializes i/o error handling
>> with device removal, in which case the translation from
>> line => uart_8250_port => uart_port is straightforward.
>>
>> Below is the serial core machinery that can be used to tick TTY_IO_ERROR.
> 
> Sorry for the late response.
> 
> I've been testing both solutions for a while now and they still doesn't
> seem to do a good job on stopping access to a device being recovered.
> Ticking TTY_IO_ERROR caused my test program to reopen the port, which
> went straight to serial8250_do_startup, accessed the device and crashed
> the recovery again.

Ah, sorry. Didn't realize that was part of the design requirements.


> I fixed that, but other places also access the
> device even though TTY_IO_ERROR is set.  Since
> uart_startup/uart_shutdown sets TTY_IO_ERROR as well, we'd need to check
> if we're already in an error condition during startup/shutdown, and
> abort initialization/shutdown.  Considering all of that, I'm starting to
> second guess the idea of using TTY_IO_ERROR for this purpose.
> 
> I'm considering another fix that calls pciserial_remove_ports when
> detecting a PCI error, which would eventually call uart_remove_one_port,
> and actually detach the port from the serial layer.  Then, I can
> register the port again after recovery, similarly to what is done in
> pciserial_init_ports during probe.  This is how other serial driver
> (jsm) handles PCI errors.

That sounds worth exploring.

You should be aware that uart_remove_one_port() is not completely
bulletproof; for example, if the device is /dev/console with open file
descriptors, vhangup() will return with still open, valid file descriptors
and the port will not have been shutdown, but the hard detach proceeds
anyway.

The most serious issue stems from how a port is unlinked from the serial
core; the ptr is simply reset, despite it may be in-use at the time. No ref
counting or locking is performed (the lock is embedded in the port structure).

It's on my todo list to correct these problems, but I haven't yet.
A recent patch series of mine, "Fix driver crashes on hangup", makes
concurrent access during unlink much more unlikely, but does not eliminate
the problem.

Just thought you'd want to know.

Regards,
Peter Hurley

--
To unsubscribe from this list: send the line "unsubscribe linux-serial" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html



[Index of Archives]     [Kernel Newbies]     [Security]     [Netfilter]     [Bugtraq]     [Linux PPP]     [Linux FS]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Samba]     [Video 4 Linux]     [Linmodem]     [Device Mapper]     [Linux Kernel for ARM]

  Powered by Linux