Re: kernel locks due to USB I/O

The objective of this email is to report the current status of my findings.

I loaded netconsole on both machines I was having problems with. I tried three times on the machine with kernel 5.0.0-37 and twice on the machine with kernel 5.3.0-62. Each attempt consisted of running the program that locks the kernel until it locked (about 3 minutes after starting the program). The program had the "semaphore code" commented out. In all 5 attempts, nothing was sent to netconsole when the kernel locked.

Just to be clear about my use of netconsole: before loading the netconsole kernel module, I ran "dmesg -n 8". When the netconsole module was loaded, I could clearly see about 9 message lines on the computer receiving the netconsole messages, telling me that netconsole was loaded (and how it was configured), so I have no doubts that netconsole was set up correctly. The "netconsole server" was a machine on the same local network.
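
For anyone repeating this test, the receiving side can be as simple as a UDP listener. The following is a minimal sketch, not necessarily what was used on the receiving machine here. It assumes the sending machine targets UDP port 6666 (the default target port given in the netconsole documentation) and that each datagram carries plain log text. The names open_listener and read_messages are hypothetical and exist only for this illustration.

```python
import socket


def open_listener(port=6666, bind_addr="0.0.0.0"):
    """Open a UDP socket for incoming netconsole datagrams.

    Port 6666 is assumed to match the target port configured on the
    sending machine; change it if netconsole was loaded with another one.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind((bind_addr, port))
    return sock


def read_messages(sock, max_messages=None, timeout=None):
    """Yield (sender_ip, text) for each datagram received on sock.

    Stops after max_messages datagrams, or when no datagram arrives
    within timeout seconds. With both left as None it runs until
    interrupted.
    """
    sock.settimeout(timeout)
    received = 0
    try:
        while max_messages is None or received < max_messages:
            try:
                data, (sender_ip, _sender_port) = sock.recvfrom(65535)
            except socket.timeout:
                return
            received += 1
            # Kernel messages are normally ASCII; replace anything else
            # instead of failing on a corrupted datagram.
            text = data.decode("utf-8", errors="replace").rstrip("\n")
            yield sender_ip, text
    finally:
        sock.close()
```

To watch messages as they arrive, loop over read_messages(open_listener()) and print each pair with flush=True, so that the last lines before a lockup are not left sitting in an output buffer.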

My next attempt will be to compile kernel 5.9, as you suggest, and try it.

Thanks,

Alberto Sentieri

On 11/11/20 10:51 AM, Alan Stern wrote:
> On Tue, Nov 10, 2020 at 06:42:17PM -0500, Alberto Sentieri wrote:
> > 1) The current Ubuntu Kernel is 5.4.0-53. Do you want me to upgrade it to
> > 5.9, from kernel.org? Or is there a Ubuntu 5.9 package that I can use? It
> > would be easy to do it If there is a Ubuntu package with 5.9, which I would
> > install and, after the tests, uninstall.
>
> If you want to know what Ubuntu packages are available, you should ask
> on an Ubuntu mailing list instead of the linux-usb mailing list.
>
> > 2) Why do you believe that 5.9 would solve the problem? I am asking that
> > because I cannot change the production machine for a test if I cannot go
> > back to the original state. There is always a risk involved.
>
> We do not believe that 5.9 will solve the problem -- we have no reason
> to believe this -- but we could be wrong.  In any case it is always
> best to test with the most up-to-date software available, and 5.9 is the
> version closest to what we are working on now.
>
> > 3) It is one single thread dealing with all 36 devices. Each device has its
> > own co-routine (not preemptive), but all co-routines are executed by a
> > unique thread.
>
> If everything runs within a single thread, how can adding a semaphore
> or mutex make any difference?
>
> > 4) By network console, do you mean ssh? It dies as well when it locks. The
> > screen is the regular GNOME3 screen and nothing can be seen there. Every
> > time it locks they send a picture, and I cannot see anything meaningful
> > there. I am thinking about disabling GNOME3, but I need their blessing for
> > that.
>
> See https://www.kernel.org/doc/Documentation/networking/netconsole.txt
> for instructions on netconsole.  And when you use it for testing, be
> sure to set the console log level to a high value.
>
> Alan Stern


